Abstract

Motivated by recent work on ordinal embedding (Kleindessner and von Luxburg, 2014), we
derive large sample consistency results and rates of convergence for the problem of embedding 
points based on triple or quadruple distance comparisons. We also consider a variant of this 
problem where only local comparisons are provided. Finally, inspired by (Jamieson and Nowak, 
2011), we bound the number of such comparisons needed to achieve consistency. 

Keywords: ordinal embedding, non-metric multidimensional scaling (MDS), dissimilarity
comparisons, landmark multidimensional scaling.


1 Introduction 

The problem of ordinal embedding, also called non-metric multidimensional scaling (Borg and Groenen, 
2005), consists of finding an embedding of a set of items based on pairwise distance comparisons. 
Specifically, suppose that $\delta_{ij} \ge 0$ is some dissimilarity measure between items $i,j \in [n] := \{1,\dots,n\}$.
We assume that $\delta_{ii} = 0$ and $\delta_{ij} = \delta_{ji}$ for all $i,j \in [n]$. These dissimilarities are either directly available
but assumed to lack meaning except for their relative magnitudes, or only available via comparisons
with some other dissimilarities, meaning that we are only provided with a subset $\mathcal{C} \subset [n]^4$ such that

$$\delta_{ij} < \delta_{k\ell}, \quad \forall (i,j,k,\ell) \in \mathcal{C}. \tag{1}$$

Note that the latter setting encompasses the former. Given $\mathcal{C}$ and a dimension $d$, the goal is to
embed the items as points $p_1,\dots,p_n \in \mathbb{R}^d$ in a way that is compatible with the available information,
specifically

$$\delta_{ij} < \delta_{k\ell} \ \Rightarrow\ \|p_i - p_j\| \le \|p_k - p_\ell\|, \quad \forall (i,j,k,\ell) \in \mathcal{C}, \tag{2}$$

where $\|\cdot\|$ denotes the Euclidean norm. The two most common situations are when all the quadruple
comparisons are available, meaning $\mathcal{C} = [n]^4$, or all triple comparisons are available, meaning
$\mathcal{C} = \{(i,j,i,k) : i,j,k \in [n]\}$, which can be identified with $[n]^3$. This problem has a long history
surveyed in (Young and Hamer, 1987), with pioneering contributions from Shepard (1962a,b) and
Kruskal (1964).
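
To make the setup concrete, the following is a minimal Python sketch of the triple-comparison setting: it lists all triples $(i,j,k)$ with $\delta_{ij} < \delta_{ik}$ for a small point set and checks whether a candidate configuration is compatible with them. The function names, the sample points, and the use of a non-strict inequality on the embedded side are assumptions of this illustration, not part of the original text.

```python
import itertools
import math


def triple_comparisons(points):
    """All triples (i, j, k) with ||x_i - x_j|| < ||x_i - x_k||."""
    n = len(points)
    return [(i, j, k)
            for i, j, k in itertools.product(range(n), repeat=3)
            if math.dist(points[i], points[j]) < math.dist(points[i], points[k])]


def respects(embedding, comparisons):
    """Check compatibility: each listed comparison holds (non-strictly) in the embedding."""
    return all(math.dist(embedding[i], embedding[j]) <= math.dist(embedding[i], embedding[k])
               for i, j, k in comparisons)


# Hypothetical items in the plane; their Euclidean distances play the role of the dissimilarities.
items = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
C = triple_comparisons(items)

# The original points are trivially a valid embedding; moving one point can break a comparison.
bad = [(0.0, 0.0), (5.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
print(respects(items, C), respects(bad, C))
```

Enumerating all triples costs on the order of $n^3$ distance evaluations, which is only practical for small $n$.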

The main question we tackle here is that of consistency. Suppose that the items are in fact points
$x_1,\dots,x_n \in \mathbb{R}^d$ and $\delta_{ij} = \|x_i - x_j\|$. (When the $\delta_{ij}$'s are available, suppose that $\delta_{ij} = g(\|x_i - x_j\|)$ where
$g$ is an unknown increasing function.) Provided with a subset $\mathcal{C} = \mathcal{C}_n$ of dissimilarity comparisons as
in (2), is it possible to reconstruct the original points in the large-sample limit $n \to \infty$? Clearly, the
reconstruction can only be up to a similarity transformation, that is, a transformation $f : \mathbb{R}^d \to \mathbb{R}^d$
such that, for some $\lambda > 0$, $\|f(x) - f(y)\| = \lambda \|x - y\|$ for all $x,y \in \mathbb{R}^d$, or equivalently, of the form
$f(x) = \lambda R(x) + b$ where $R$ is an orthogonal transformation and $b$ is a constant vector, since such
a transformation leaves the distance comparisons unchanged. This question is at the foundation of
non-metric multidimensional scaling.
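
The invariance under similarity transformations can be checked numerically. The sketch below (plain Python, in the plane) applies a map of the form scale times rotation plus shift to a small point set and compares the full set of quadruple comparisons before and after. The helper names, the sample points, and the particular scale, angle, and shift are illustrative assumptions.

```python
import itertools
import math


def similarity(point, scale, angle, shift):
    """Apply x -> scale * R(x) + shift in the plane, with R the rotation by `angle`."""
    x, y = point
    c, s = math.cos(angle), math.sin(angle)
    return (scale * (c * x - s * y) + shift[0],
            scale * (s * x + c * y) + shift[1])


def comparison_pattern(points):
    """Set of quadruples (i, j, k, l) with ||x_i - x_j|| < ||x_k - x_l||."""
    n = len(points)
    return {(i, j, k, l)
            for i, j, k, l in itertools.product(range(n), repeat=4)
            if math.dist(points[i], points[j]) < math.dist(points[k], points[l])}


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
moved = [similarity(p, scale=2.5, angle=0.7, shift=(4.0, -1.0)) for p in points]

# The two configurations induce the same ordinal information.
print(comparison_pattern(points) == comparison_pattern(moved))
```

The example points have well-separated pairwise distances, so floating-point rounding does not affect the comparisons here; configurations with near-ties would need a tolerance.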

Early work only addressed the continuous case, where the $x$'s span a whole convex subset
$U \subset \mathbb{R}^d$. In that setting, the goal becomes to characterize isotonic functions on $U$, that is, functions
$f : U \to \mathbb{R}^d$ satisfying

$$\|x - y\| < \|x' - y'\| \ \Rightarrow\ \|f(x) - f(y)\| \le \|f(x') - f(y')\|, \quad \forall x,y,x',y' \in U. \tag{3}$$

Shepard (1966) argues that such functions must be similarities, and cites earlier work (Aumann and Kruskal,
1958; Suppes and Winet, 1955) dealing with the case $d = 1$.

Only recently has the finite sample case been formally considered. Indeed, Kleindessner and von Luxburg
(2014) prove a consistency result, showing that if $\{x_1,\dots,x_n\} \subset U \subset \mathbb{R}^d$, where $U$ is a bounded,
connected, and open subset of $\mathbb{R}^d$ satisfying some additional conditions (for example, a finite union
of open balls) and $\mathcal{C} = [n]^4$, then in the large sample limit with $x_1,\dots,x_n$ becoming dense in $U$,
it is possible to recover the $x$'s up to a similarity transformation. (Note that $U$ is then uniquely
defined as the interior of $\overline{\{x_i : i \ge 1\}}$.) We note that Kleindessner and von Luxburg (2014) focus on
the strictly isotonic case, where the second inequality in (3) is strict. Our first contribution is an
extension of this consistency result for quadruple learning to triple learning where $\mathcal{C} = [n]^3$. In the
process, we greatly simplify the arguments of Kleindessner and von Luxburg (2014) and weaken the
conditions on the sampling domain $U$. We note that Terada and Von Luxburg (2014) have partially
solved this problem by a reduction to the problem of embedding a nearest-neighbor graph. However,
their arguments are based on an apparently incomplete proof in (Von Luxburg and Alamgir,
2013), which is itself based on a rather sophisticated approach. Our proofs are comparatively much
simpler and more direct.

Our second contribution is to provide rates of convergence, a problem left open by Kleindessner and von Luxburg
(2014). In the context of quadruple learning, we obtain a rate in $O(\varepsilon_n)$, where $\varepsilon_n$ is the Hausdorff
distance between the underlying sample $\{x_1,\dots,x_n\}$ and $U$, meaning, $\varepsilon_n := \sup_{x \in U} \min_{i \in [n]} \|x - x_i\|$.
This is the first convergence rate for exact ordinal embedding that we know of. (We are not able
to obtain the same rate in the context of triple learning.) Compared to establishing consistency,
the proof is much more involved.

The last decade has seen a surge of interest in ordinal embedding, motivated by applications
to recommender systems and large-scale psychometric studies made available via the internet, for
example, databases for music artists similarity (Ellis et al., 2002; McFee and Lanckriet, 2011). Sensor
localization (Nhat et al., 2008) is another possible application. Modern datasets being large,
all quadruple or triple comparisons are rarely available, motivating the proposal of embedding
methods based on a sparse set of comparisons (Agarwal et al., 2007; Borg and Groenen, 2005;
Jamieson and Nowak, 2011; Terada and Von Luxburg, 2014). Terada and Von Luxburg (2014)
study what they call local ordinal embedding, which they define as the problem of embedding
an unweighted $K$-nearest neighbor ($K$-NN) graph. With our notation, this is the situation where
$\mathcal{C} = \{(i,j,k) : \delta_{ij} \le \delta_{i(K)} < \delta_{ik}\}$, $\delta_{i(K)}$ being the dissimilarity between item $i$ and its $K$th
nearest-neighbor. Terada and Von Luxburg (2014) argue that, when the items are points $x_1,\dots,x_n$ sampled
from a smooth density on a bounded, connected, convex, and open subset $U \subset \mathbb{R}^d$ with smooth
boundary, then $K = K_n \gg n^{2/(2+d)} (\log n)^{d/(2+d)}$ is enough for consistency. Our third contribution
is to consider the related situation where $\mathcal{C} = \{(i,j,k,\ell) : \delta_{ij} < \delta_{k\ell} \text{ and } \max(\delta_{ij}, \delta_{ik}, \delta_{i\ell}) \le \delta_{i(K)}\}$,
which provides us with the $K$-NN graph and also all the quadruple comparisons between the nearest
neighbors. In this setting, we are only able to show that $K_n \gg \sqrt{n \log n}$ is enough.

Beyond local designs, which may not be feasible in some settings, Jamieson and Nowak (2011) 
consider the problem of adaptively (i.e., sequentially) selecting triple comparisons in order to
minimize the number of such comparisons and yet deduce all the other triple comparisons. They
consider a few methods, among which a non-metric version of the landmark MDS method of
De Silva and Tenenbaum (2004). Less ambitious is the problem of selecting few comparisons in
order to consistently embed the items when these are points in a Euclidean space. Our fourth
contribution is to show that one can obtain a consistent embedding with a landmark design based
on $a_n n$ queries, where $a_n$ is any diverging sequence. Moreover, the embedding can be computed in
(expected) time $\zeta(a_n) n$, for some function $\zeta : \mathbb{R}_+ \to \mathbb{R}_+$.

The rest of the paper is organized as follows. In Section 2, we state our theoretical results and 
prove the simpler ones. We then gather the remaining proofs in Section 3. Section 4 concludes the 
paper with a short discussion. 


2 Theory 

In this section we present our theoretical findings. Most proofs are gathered in Section 3. 

We already defined isotonic functions in (3). Following (Kleindessner and von Luxburg, 2014),
we say that a function $f : U \subset \mathbb{R}^d \to \mathbb{R}^d$ is weakly isotonic if

$$\|x - y\| < \|x - z\| \ \Rightarrow\ \|f(x) - f(y)\| \le \|f(x) - f(z)\|, \quad \forall x,y,z \in U. \tag{4}$$

Obviously, if a function is isotonic (3), then it is weakly isotonic (4). Weak isotonicity is in fact not
much weaker than isotonicity. Indeed, let $P$ be a property (e.g., 'isotonic'), and say that a function
$f : U \subset \mathbb{R}^d \to \mathbb{R}^d$ has the property $P$ locally if for each $x \in U$ there is $r > 0$ such that $f$ has property
$P$ on $B(x,r) \cap U$, where $B(x,r)$ denotes the open ball with center $x$ and radius $r$.

Lemma 1. Any locally weakly isotonic function on an open U is also locally isotonic on U. 

Proof. This is an immediate consequence of (Kleindessner and von Luxburg, 2014, Lem 6), which
implies that a weakly isotonic function on $B(x,r)$ is isotonic on $B(x,r/4)$. □

Suppose we have data points $x_1,\dots,x_n \in \mathbb{R}^d$. Define

$$\Omega_n = \{x_1,\dots,x_n\}; \qquad \Omega = \bigcup_{n \ge 1} \Omega_n = \{x_n : n \ge 1\}. \tag{5}$$

Let $\delta_{ij} = \|x_i - x_j\|$ and suppose that we are only provided with a subset $\mathcal{C}_n \subset [n]^4$ of distance
comparisons as in (1). To an (exact) ordinal embedding $p : [n] \to \mathbb{R}^d$, which by definition satisfies
(2), we associate the map $\phi_n : \Omega_n \to \mathbb{R}^d$ defined by $\phi_n(x_i) = p_i$ for all $i \in [n]$. We crucially observe
that, in the case of all quadruple comparisons ($\mathcal{C}_n = [n]^4$), the resulting map $\phi_n$ is isotonic on $\Omega_n$; in
the case of all triple comparisons ($\mathcal{C}_n = [n]^3$), $\phi_n$ is only weakly isotonic on $\Omega_n$ instead. In light of
this, and the fact that the location, orientation and scale are all lost when only ordinal information
is available, the problem of proving consistency of (exact) ordinal embedding reduces to showing
that any such embedding is close to a similarity transformation as the sample size increases, $n \to \infty$.
This is exactly what Kleindessner and von Luxburg (2014) do under some assumptions.

2.1 Ordinal embedding based on all triple comparisons 

Our first contribution is to extend the consistency results of Kleindessner and von Luxburg (2014)
on quadruple learning to triple learning. Following their presentation, we start with a result where
the sample is infinite, which is only a mild generalization of (Kleindessner and von Luxburg, 2014,
Th 3).

Theorem 1. Let $U \subset \mathbb{R}^d$ be bounded, connected and open. Suppose $\Omega$ is dense in $U$ and consider
a locally weakly isotonic function $\phi : \Omega \to \mathbb{R}^d$. Then there is a similarity transformation $S$ that
coincides with $\phi$ on $\Omega$.

The proof is largely based on that of (Kleindessner and von Luxburg, 2014, Th 3), but a bit 
simpler; see Section 3.1. 

We remark that there can only be one similarity with the above property, since similarities are 
affine transformations, and two affine transformations of $\mathbb{R}^d$ that coincide on $d+1$ affinely independent
points are necessarily identical.

In this theorem, the set $\Omega$ is dense in an open subset of $\mathbb{R}^d$ and therefore infinite. In fact,
Kleindessner and von Luxburg (2014) use this theorem as an intermediary result for proving
consistency as the sample size increases. Most of their paper is dedicated to establishing this, as their
arguments are quite elaborate. We found a more direct route by 'tending to the limit as soon as
possible', based on Lemma 2 below, which is at the core of the Arzelà-Ascoli theorem.

For the remainder of this section, we consider the finite sample setting:

$U \subset \mathbb{R}^d$ is bounded, connected and open,

$\Omega_n = \{x_1,\dots,x_n\} \subset U$ is such that $\Omega := \{x_n : n \ge 1\}$ is dense in $U$, (6)

and $\phi_n : \Omega_n \to Q \subset \mathbb{R}^d$ is a function with values in a bounded set $Q$.

In the context of (6), we implicitly extend $\phi_n$ to $\Omega$, for example, by setting $\phi_n(x) = q$ for all
$x \in \Omega \setminus \Omega_n$, where $q$ is a given point in $Q$, although the following holds for any extension.

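Lemma 2. Consider $\Omega_n \subset \mathbb{R}^d$ finite and $\phi_n : \Omega_n \to Q \subset \mathbb{R}^d$, where $Q$ is bounded. Then there is
$N \subset \mathbb{N}$ infinite such that $\phi(x) := \lim_{n \in N} \phi_n(x)$ exists for all $x \in \Omega := \bigcup_n \Omega_n$.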

This is called the diagonal process in (Kelley, 1975, Problem D, Ch 7). Although the result is 
classical, we provide a proof for completeness. 

Proof. Without loss of generality, suppose $\Omega_n = \{x_1,\dots,x_n\}$. Let $N_0 = \mathbb{N}$. Since $(\phi_n(x_1) : n \in N_0) \subset Q$
and $Q$ is bounded, there is $N_1 \subset N_0$ infinite such that $\lim_{n \in N_1} \phi_n(x_1)$ exists. In turn, since
$(\phi_n(x_2) : n \in N_1)$ is bounded, there is $N_2 \subset N_1$ infinite such that $\lim_{n \in N_2} \phi_n(x_2)$ exists. Continuing
this process, which formally corresponds to a recursion, we obtain $\cdots \subset N_{k+1} \subset N_k \subset \cdots \subset N_1 \subset N_0 = \mathbb{N}$
such that, for all $k$, $N_k$ is infinite and $\lim_{n \in N_k} \phi_n(x_k)$ exists. Let $n_k$ denote the $k$th element
(in increasing order) of $N_k$ and note that $(n_k : k \ge 1)$ is strictly increasing. Define $N = \{n_k : k \ge 1\}$.
Since $\{n_p, p \ge k\} \subset N_k$, we have $\lim_{n \in N} \phi_n(x_k) = \lim_{n \in N_k} \phi_n(x_k)$, and this is valid for all $k \ge 1$. □

Corollary 1. Consider the setting (6) and assume that $\phi_n$ is weakly isotonic. Then $(\phi_n)$ is
sequentially pre-compact for the pointwise convergence topology for functions on $\Omega$ and all the
functions where it accumulates are similarity transformations restricted to $\Omega$.

The corresponding result (Kleindessner and von Luxburg, 2014, Th 4) was obtained for isotonic 
(instead of weakly isotonic) functions and for domains U that are finite unions of balls, and the 
convergence was uniform instead of pointwise. For now, we provide a proof of Corollary 1, which 
we derive as a simple consequence of Theorem 1 and Lemma 2. 

Proof. Lemma 2 implies that $(\phi_n)$ is sequentially pre-compact for the pointwise convergence
topology. Let $\phi$ be an accumulation point of $(\phi_n)$, meaning that there is $N \subset \mathbb{N}$ infinite such that
$\phi(x) = \lim_{n \in N} \phi_n(x)$ for all $x \in \Omega$. Take $x,y,z \in \Omega$ such that $\|x - y\| < \|x - z\|$. By definition, there
is $m$ such that $x,y,z \in \Omega_m$, and therefore $\|\phi_n(x) - \phi_n(y)\| \le \|\phi_n(x) - \phi_n(z)\|$ for all $n \ge m$. Passing
to the limit along $n \in N$, we obtain $\|\phi(x) - \phi(y)\| \le \|\phi(x) - \phi(z)\|$. Hence, $\phi$ is weakly isotonic on
$\Omega$ and, by Theorem 1, it is therefore the restriction of a similarity transformation to $\Omega$. □
It is true that (Kleindessner and von Luxburg, 2014, Th 4) establishes a uniform convergence
result. We do the same in Theorem 2 below, but with much simpler arguments. The key is
the following two results bounding the modulus of continuity of an isotonic (resp. weakly isotonic) function.
We note that the second result (for weakly isotonic functions) is very weak but sufficient for our
purposes here. For $\Lambda \subset V \subset \mathbb{R}^d$, define $\delta_H(\Lambda, V) = \sup_{y \in V} \inf_{x \in \Lambda} \|y - x\|$, which is their Hausdorff
distance. We say that $(y_i : i \in I) \subset \mathbb{R}^d$ is an $\eta$-packing if $\|y_i - y_j\| \ge \eta$ for all $i \ne j$. We recall that
the size of the largest $\eta$-packing of a Euclidean ball of radius $r$ is of exact order $(r/\eta)^d$. For a set
$V \subset \mathbb{R}^d$, let $\mathrm{diam}(V) = \sup_{x,y \in V} \|x - y\|$ be its diameter and let

$$\rho(V) = \arg\sup_{r > 0} \{\exists v \in V : B(v, r/2) \subset V\}, \tag{7}$$

which is the diameter of a largest ball inscribed in $V$.

Everywhere in the paper, $d$ is fixed, and in fact implicitly small as we assume repeatedly that
the sample (of size $n$) is dense in a full-dimensional domain of $\mathbb{R}^d$. In particular, all the implicit
constants of proportionality that follow depend solely on $d$.

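Lemma 3. Let $V \subset \mathbb{R}^d$ be open. Consider $\Lambda \subset V$ and set $\varepsilon = \delta_H(\Lambda, V)$. Let $\psi : \Lambda \to Q$ be isotonic,
where $Q \subset \mathbb{R}^d$ is bounded. There is $C \propto \mathrm{diam}(Q)/\rho(V)$, such that

$$\|\psi(x) - \psi(x')\| \le C(\|x - x'\| + \varepsilon), \quad \forall x,x' \in \Lambda. \tag{8}$$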

Proof. The proof is based on the fact that an isotonic function transforms a packing into a packing.
Take $x,x' \in \Lambda$ such that $\xi := \|\psi(x) - \psi(x')\| > 0$, and let $\eta = \|x - x'\|$. Since $V$ is open it contains
an open ball of diameter $\rho(V)$. Let $y_1,\dots,y_m$ be an $(\eta + 3\varepsilon)$-packing of such a ball with
$m \ge C_1 (\rho(V)/(\eta + \varepsilon))^d$ for some constant $C_1$ depending only on $d$. Then let $x_1,\dots,x_m \in \Lambda$ such that
$\max_i \|y_i - x_i\| \le \varepsilon$. By the triangle inequality, for all $i \ne j$ we have

$$\|x_i - x_j\| \ge \|y_i - y_j\| - 2\varepsilon \ge \eta + \varepsilon > \|x - x'\|.$$

Because $\psi$ is isotonic, we have $\|\psi(x_i) - \psi(x_j)\| \ge \xi$, so that $\psi(x_1),\dots,\psi(x_m)$ is a $\xi$-packing.
Therefore, there is a constant $C_2$ depending only on $d$ such that $m \le C_2 (\mathrm{diam}(Q)/\xi)^d$. We conclude
that $\xi \le (C_2/C_1)^{1/d} (\mathrm{diam}(Q)/\rho(V)) (\eta + \varepsilon)$. □

For $V \subset \mathbb{R}^d$ and $h > 0$, let $V^h = \{x \in V : \exists y \in V \text{ s.t. } x \in B(y,h) \subset V\}$. We note that $V^h$ is the
complement of the $h$-convex hull of $V^c := \mathbb{R}^d \setminus V$; see (Cuevas et al., 2012) and references therein.

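Lemma 4. In the context of Lemma 3, if $\psi$ is only weakly isotonic, then there is $C \propto \mathrm{diam}(Q)$,
such that for all $h > 0$,

$$\|\psi(x) - \psi(x')\| \le C\big(\|x - x'\|/h + \sqrt{\varepsilon/h}\big)^{1/d}, \quad \forall x \in \Lambda \cap V^h,\ \forall x' \in \Lambda. \tag{9}$$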

Proof. Assume that $V^h \ne \emptyset$, for otherwise there is nothing to prove. Take $x \in \Lambda \cap V^h$ and $x' \in \Lambda$
such that $\xi := \|\psi(x) - \psi(x')\| > 0$, and let $\eta = \|x - x'\|$. Because $\psi$ is bounded, it is enough to prove
the result when $\eta, \varepsilon < h/2$. Let $y \in V$ be such that $x \in B(y,h) \subset V$. There is $y' \in B(y,h)$ such
that $y \in [xy']$ and $\|x - y'\| \ge 2h/3$. Define $u = (y' - x)/\|y' - x\|$. Let $z_0 = x$, and for $j \ge 1$, define
$z_j = z_{j-1} + (\eta + 5j\varepsilon)u$. Let $k \ge 0$ be maximum such that $\sum_{j=1}^k (\eta + 5j\varepsilon) < h/2$. Since $k$ satisfies
$k\eta + 5k^2\varepsilon \ge h/2$, we have $k \ge \min(h/(4\eta), \sqrt{h/(10\varepsilon)})$. By construction, for all $j \in [k]$, $z_j \in [xy']$ and
$B(z_j, 2\varepsilon) \subset B(y,h)$. Let $x_{-1} = x'$, $x_0 = x$ and take $x_1,\dots,x_k \in \Lambda$ such that $\max_j \|x_j - z_j\| \le \varepsilon$. By
the triangle inequality, for $j = 2,\dots,k$,

$$\|x_j - x_{j-1}\| \ge \|z_j - z_{j-1}\| - 2\varepsilon = \|z_{j-1} - z_{j-2}\| + 3\varepsilon \ge \|x_{j-1} - x_{j-2}\| + \varepsilon,$$

which implies by induction that

$$\|x_j - x_{j-1}\| \ge \|x_1 - x_0\| + \varepsilon \ge \|z_1 - z_0\| = \eta + 5\varepsilon > \|x - x'\|.$$

By weak isotonicity, this implies that $\|\psi(x_j) - \psi(x_{j-1})\| \ge \|\psi(x) - \psi(x')\| = \xi$. We also have, for
any $i,j \in [k]$ such that $1 \le i \le j - 2$,

$$\|x_j - x_i\| \ge \|z_j - z_i\| - 2\varepsilon \ge \|z_j - z_{j-1}\| + \eta + 5\varepsilon - 2\varepsilon \ge \|x_j - x_{j-1}\| + \eta + \varepsilon.$$

By weak isotonicity, this implies that $\|\psi(x_j) - \psi(x_i)\| \ge \xi$ for all $0 \le i < j \le k$.

Consequently, $(\psi(x_j) : j \in [k])$ forms a $\xi$-packing of $Q$. Hence, $k \le C'(\mathrm{diam}(Q)/\xi)^d$, for some
constant $C'$. We conclude with the lower bound on $k$. □

From this control on the modulus of continuity, we obtain a stronger version of Corollary 1. 

Theorem 2. Under the same conditions as Corollary 1, we have the stronger conclusion that there
is a sequence $(S_n)$ of similarities such that, for all $h > 0$, $\max_{x \in \Omega_n \cap U^h} \|\phi_n(x) - S_n(x)\| \to 0$ as
$n \to \infty$. If in fact each $\phi_n$ is isotonic, then this remains true when $h = 0$.

We remark that when $U$ is a connected union of a possibly uncountable number of open balls of
radius at least $h > 0$, then $U = U^h$. This covers the case of a finite union of open balls considered in
(Kleindessner and von Luxburg, 2014). We also note that, if $U$ is bounded and open, and $\partial U$ has
bounded curvature, then there is $h > 0$ such that $U = U^h$. This follows from the fact that, in this
case, $U^c$ has positive reach (Federer, 1959), and is therefore $h$-convex when $h$ is below the reach
by¹ (Cuevas et al., 2012, Prop 1). Moreover, our arguments can be modified to accommodate sets
$U$ with boundaries that are only Lipschitz, by reasoning with wedges in Lemma 4.

Theorem 2 now contains (Kleindessner and von Luxburg, 2014, Th 4), and extends it to weakly 
isotonic functions and to more general domains $U$. Overall, our proof technique is much simpler,
shorter, and more elementary.

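Define $\varepsilon_n = \delta_H(\Omega_n, U)$, which quantifies the density of $\Omega_n$ in $U$. Because $\Omega_n \subset \Omega_{n+1}$ and $\Omega$ is
dense in $U$, we have $\varepsilon_n \to 0$ as $n \to \infty$.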

Proof. Let $\phi$ be an accumulation point of $(\phi_n)$ for the pointwise convergence topology, meaning
there is $N \subset \mathbb{N}$ infinite such that $\phi(x) = \lim_{n \in N} \phi_n(x)$ for all $x \in \Omega$. We show that, in fact, the
convergence is uniform.

First, suppose that each $\phi_n$ is isotonic. In that case, Lemma 3 implies the existence of a constant
$C > 0$ such that $\|\phi_n(x) - \phi_n(x')\| \le C(\|x - x'\| + \varepsilon_n)$ for all $x,x' \in \Omega_n$, and for all $n$. Passing to the limit
along $n \in N$, we get $\|\phi(x) - \phi(x')\| \le C\|x - x'\|$ for all $x,x' \in \Omega$. (In fact, we already knew this from
Corollary 1, since we learned there that $\phi$ coincides with a similarity, and is therefore Lipschitz.)
Fix $\varepsilon > 0$. There is $m$ such that $\varepsilon_m \le \varepsilon$. Then there is $m' \ge m$ such that $\max_{i \in [m]} \|\phi_n(x_i) - \phi(x_i)\| \le \varepsilon$
for all $n \in N$ with $n \ge m'$. For such an $n$, and $x \in \Omega_n$, let $i \in [m]$ be such that $\|x - x_i\| \le \varepsilon_m$. By the
triangle inequality,

$$\|\phi_n(x) - \phi(x)\| \le \|\phi_n(x) - \phi_n(x_i)\| + \|\phi_n(x_i) - \phi(x_i)\| + \|\phi(x_i) - \phi(x)\|$$
$$\le C(\|x - x_i\| + \varepsilon_n) + \|\phi_n(x_i) - \phi(x_i)\| + C\|x_i - x\|$$
$$\le C(\varepsilon_m + \varepsilon_n) + \varepsilon + C\varepsilon_m \le (3C + 1)\varepsilon.$$

Since $x \in \Omega_n$ is arbitrary and $\varepsilon$ can be taken as small as desired, this shows that the sequence
$(\phi_n : n \in N)$ converges uniformly to $\phi$ over $(\Omega_n : n \in N)$.

¹This proposition is stated for compact sets (which is not the case of $U^c$) but easily extends to the case where the set
is closed with compact boundary.
When the $\phi_n$ are only weakly isotonic, we use Lemma 4 to get a constant $C > 0$ depending on
$\mathrm{diam}(Q)$ and $h > 0$ such that $\|\phi_n(x) - \phi_n(x')\| \le C(\|x - x'\| + \sqrt{\varepsilon_n})^{1/d}$ for all $x \in \Omega_n \cap U^h$ and all
$x' \in \Omega_n$, and for all $n$. Passing to the limit along $n \in N$, we get $\|\phi(x) - \phi(x')\| \le C\|x - x'\|^{1/d}$ for all
$x,x' \in \Omega \cap U^h$. (In fact, $\|\phi(x) - \phi(x')\| \le C\|x - x'\|$ for all $x,x' \in \Omega$ from Corollary 1, as explained above.)
The rest of the arguments are completely parallel. We conclude that $(\phi_n : n \in N)$ converges
uniformly to $\phi$ over $(\Omega_n \cap U^h : n \in N)$.

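Let $\mathcal{S}$ denote the similarities of $\mathbb{R}^d$. For any functions $\phi, \psi : \Omega \to \mathbb{R}^d$, define
$\delta_n(\phi, \psi) = \max_{x \in \Omega_n \cap U^h} \|\phi(x) - \psi(x)\|$, and also $\delta_n(\phi, \mathcal{S}) = \inf_{S \in \mathcal{S}} \delta_n(\phi, S)$. Our end goal is to show that
$\delta_n(\phi_n, \mathcal{S}) \to 0$ as $n \to \infty$. Suppose this is not the case, so that there is $\eta > 0$ and $N \subset \mathbb{N}$
infinite such that $\delta_n(\phi_n, \mathcal{S}) \ge \eta$ for all $n \in N$. By Corollary 1, there is $N_1 \subset N$ and $S \in \mathcal{S}$
such that $S(x) = \lim_{n \in N_1} \phi_n(x)$ for all $x \in \Omega$. As we showed above, the convergence is in fact
uniform over $(\Omega_n \cap U^h : n \in N_1)$, meaning $\lim_{n \in N_1} \delta_n(\phi_n, S) = 0$. At the same time, we have
$\delta_n(\phi_n, S) \ge \delta_n(\phi_n, \mathcal{S}) \ge \eta$. We therefore have a contradiction. □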

2.2 Rates of convergence 

Beyond consistency, we are able to derive convergence rates. We do so for the isotonic case, i.e.,
the quadruple comparison setting. Recall that $\varepsilon_n = \delta_H(\Omega_n, U)$.

Theorem 3. Consider the setting (6) with $\phi_n$ isotonic. There is $C$ depending only on $(d,U)$, and
a sequence of similarities $S_n$ such that $\max_{x \in \Omega_n} \|\phi_n(x) - S_n(x)\| \le C\, \mathrm{diam}(Q)\, \varepsilon_n$. If $U = U^h$ for
some $h > 0$, then $C = C'/\mathrm{diam}(U)$ where $C'$ is a function of $(d, h/\mathrm{diam}(U), \rho(U)/\mathrm{diam}(U))$.

The proof of Theorem 3 is substantially more technical than the previous results, and thus
postponed to Section 3. Although Kleindessner and von Luxburg (2014) are not able to obtain
rates of convergence, the proof of Theorem 3 bears resemblance to their proof technique, and in
particular, is also based on a result of Alestalo et al. (2001) on the approximation of $\varepsilon$-isometries;
see Lemma 18. We will also make use of a related result of Vestfrid (2003) on the approximation of
approximately midlinear functions; see Lemma 17. We mention that we know of a more elementary
proof that only makes use of (Alestalo et al., 2001), but yields a slightly slower rate of convergence.

We note that there is a constant $c$ depending only on $d$ such that $\varepsilon_n \ge c\, n^{-1/d}$. This is because
$U$ being open, it contains an open ball, and this lower bound trivially holds for an open ball. And
such a lower bound is achieved when the $x_i$'s are roughly regularly spread out over $U$. If instead
the $x_i$'s are iid uniform in $U$, and $U$ is sufficiently regular (for example, $U = U^h$ for some $h > 0$),
then $\varepsilon_n = O(\log(n)/n)^{1/d}$, as is well-known. This would give the rate, and we do not know whether
it is optimal, even in dimension $d = 1$.
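
The density parameter (the Hausdorff distance between the sample and the domain) can be approximated numerically. The Python sketch below does so for points drawn uniformly from the open unit square, replacing the supremum over the domain by a maximum over a finite grid. The choice of domain, the grid resolution, the seed, and the function name are assumptions made for illustration, and the grid maximum only approximates the true supremum from below.

```python
import math
import random


def covering_radius(sample, grid_size=30):
    """Approximate sup over x in (0,1)^2 of min_i ||x - x_i||, using grid-cell centers."""
    worst = 0.0
    for a in range(grid_size):
        for b in range(grid_size):
            g = ((a + 0.5) / grid_size, (b + 0.5) / grid_size)
            nearest = min(math.dist(g, p) for p in sample)
            worst = max(worst, nearest)
    return worst


rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(800)]

# Nested samples: the approximation cannot increase when points are added.
eps_200 = covering_radius(points[:200])
eps_800 = covering_radius(points[:800])
print(eps_200, eps_800)
```

Because the samples are nested, the computed value is non-increasing in the sample size, mirroring the monotonicity of the exact quantity.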

Remark 1. We are only able to get a rate in for the weakly isotonic case. We can do so by 
adapting the arguments underlying Theorem 3, but only after assuming that $U = U^h$ for some
$h > 0$ and resolving a few additional complications.

2.3 Ordinal embedding with local comparisons 

Terada and Von Luxburg (2014) consider the problem of embedding an unweighted nearest-neighbor
graph, which as we saw in the Introduction, is a special case of ordinal embedding. Their
arguments (which, as we explained earlier, seem incomplete at the time of writing) indicate that
$K = K_n \gg n^{2/(2+d)} (\log n)^{d/(2+d)}$ is enough for consistently embedding a $K$-NN graph.

We consider here a situation where we have more information, specifically, all the distance
comparisons between $K$-nearest-neighbors. Formally, this is the situation where

$$\mathcal{C}_n = \{(i,j,k,\ell) : \delta_{ij} < \delta_{k\ell} \text{ and } \{j,k,\ell\} \subset N_K(i)\},$$

where $N_K(i)$ denotes the set of the $K$ items nearest item $i$. If the items are points $\Omega_n =
\{x_1,\dots,x_n\} \subset \mathbb{R}^d$, an exact ordinal embedding $\phi_n$ is only constrained to be locally weakly
isotonic as we explain now. We start by stating a standard result which relates a $K$-NN graph to an
$r$-ball graph.

Lemma 5. Let $U \subset \mathbb{R}^d$ be bounded, connected and open, and such that $U = U^h$ for some $h > 0$.
Sample $x_1,\dots,x_n$ iid from a density $f$ supported on $U$ with (essential) range in $(0, \infty)$ strictly.
There is a constant $C$ such that, if $K := [n r^d] \ge C \log n$, then with probability tending to 1,

$$\mathrm{Neigh}_{K/2}(x_i) \subset \{x_j : \|x_j - x_i\| \le r\} \subset \mathrm{Neigh}_{2K}(x_i), \quad \forall i \in [n],$$

where $\mathrm{Neigh}_K(x_i)$ denotes the set of the $K$ points in $\{x_j : j \in [n]\}$ nearest $x_i$.

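The proof is postponed to Section 3 and only provided for completeness. Therefore, assuming
that $K \ge C \log n$, where $C$ is the constant of Lemma 5, we may equivalently consider the case where

$$\mathcal{C}_n = \{(i,j,k,\ell) : \delta_{ij} < \delta_{k\ell} \text{ and } \max(\delta_{ij}, \delta_{ik}, \delta_{i\ell}) < r_n\},$$

for some given $r_n > 0$. An exact embedding $\phi_n : \Omega_n \to \mathbb{R}^d$ in that case is isotonic on $\Omega_n \cap B(x, r_n)$
for any $x \in \Omega_n$. We require in addition that

$$\|x - x'\| < r_n \le \|x^\dagger - x^\ddagger\| \ \Rightarrow\ \|\phi_n(x) - \phi_n(x')\| \le \|\phi_n(x^\dagger) - \phi_n(x^\ddagger)\|, \quad \forall x, x', x^\dagger, x^\ddagger \in \Omega_n. \tag{10}$$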

This is a reasonable requirement since it is possible to infer it from $\mathcal{C}_n$. Indeed, for $k,\ell \in [n]$, we
have $\delta_{k\ell} < r_n$ if, and only if, $(k,k,k,\ell) \in \mathcal{C}_n$ or $(\ell,\ell,\ell,k) \in \mathcal{C}_n$. (Here we assume that $\delta_{ii} = 0$ for
all $i$ and $\delta_{ij} > 0$ if $i \ne j$, as is the case for Euclidean distances.) We can still infer this even if the
quadruples in $\mathcal{C}_n$ must include at least three distinct items. Indeed, suppose $k,\ell \in [n]$ are such
that there is no $i$ such that $(i,k,i,\ell) \in \mathcal{C}_n$ or $(i,\ell,i,k) \in \mathcal{C}_n$, then (a) $\delta_{ik} = \delta_{i\ell}$ for all $i$ such that
$\max(\delta_{ik}, \delta_{i\ell}) < r_n$, or (b) $\delta_{k\ell} \ge r_n$. Assume that $r_n \ge C\varepsilon_n$ with $C > 0$ sufficiently large, so that
situation (a) does not happen. Conversely, if $(k,\ell)$ is such that $\delta_{k\ell} < r_n$, then when (a) does not
happen, there is $i$ such that $(i,k,i,\ell) \in \mathcal{C}_n$ or $(i,\ell,i,k) \in \mathcal{C}_n$.
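
The first inference step above can be checked on a toy example. The following Python sketch builds the radius-restricted set of quadruple comparisons for a handful of planar points and verifies that the pairs at distance less than the radius are exactly those revealed by degenerate quadruples of the form $(k,k,k,\ell)$ or $(\ell,\ell,\ell,k)$. The point set, the radius value, and the function name are hypothetical choices for illustration.

```python
import itertools
import math


def local_quadruples(points, radius):
    """Quadruples (i, j, k, l) with d_ij < d_kl and max(d_ij, d_ik, d_il) < radius."""
    n = len(points)
    d = [[math.dist(p, q) for q in points] for p in points]
    return {(i, j, k, l)
            for i, j, k, l in itertools.product(range(n), repeat=4)
            if d[i][j] < d[k][l] and max(d[i][j], d[i][k], d[i][l]) < radius}


points = [(0.0, 0.0), (0.3, 0.1), (0.5, 0.6), (0.9, 0.2), (0.2, 0.8), (0.7, 0.9)]
r = 0.6
C = local_quadruples(points, r)
n = len(points)

# Pairs of distinct items that are truly within distance r of each other.
close_pairs = {(k, l) for k in range(n) for l in range(n)
               if k != l and math.dist(points[k], points[l]) < r}

# Pairs inferred from the comparison set alone, via degenerate quadruples.
inferred = {(k, l) for k in range(n) for l in range(n)
            if k != l and ((k, k, k, l) in C or (l, l, l, k) in C)}

print(close_pairs == inferred)
```

The sketch covers only the case where degenerate quadruples are allowed; the variant requiring at least three distinct items relies on the density assumption discussed in the text and is not reproduced here.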

Theorem 4. Consider the setting (6) and assume in addition that $U = U^h$ for some $h > 0$, and that
$\phi_n$ is isotonic over balls of radius $r_n$ and satisfies (10). There is a constant $C > 0$ depending only on
$(d, h, \rho(U), \mathrm{diam}(U), \mathrm{diam}(Q))$ and similarities $S_n$ such that $\max_{x \in \Omega_n} \|\phi_n(x) - S_n(x)\| \le C\varepsilon_n / r_n^2$.

Assume the data points are generated as in Lemma 5. In that case, we have $\varepsilon_n = O(\log(n)/n)^{1/d}$
and Theorem 4 implies consistency when $r_n \gg (\log(n)/n)^{1/(2d)}$. By Lemma 5, this corresponds to
the situation where we are provided with comparisons among $K_n$-nearest neighbors with $K_n \gg
\sqrt{n \log n}$. If the result of Terada and Von Luxburg (2014) holds in all rigor, then this is a rather
weak result.

2.4 Landmark ordinal embedding 

Inspired by (Jamieson and Nowak, 2011), we consider the situation where there are landmark items
indexed by $L_n \subset [n]$, and we are given all distance comparisons from any point to the landmarks.
Formally, with triple comparisons, this corresponds to the situation where

$$\mathcal{C}_n = \{(i,j,k) \in [n] \times L_n^2 : \delta_{ij} < \delta_{ik}\}.$$

If the items are points $\Omega_n = \{x_1,\dots,x_n\} \subset \mathbb{R}^d$, an exact ordinal embedding $\phi_n$ is only constrained
to be weakly isotonic on the set of landmarks and, in addition, is required to respect the ordering of
the distances from any point to the landmarks. The following is an easy consequence of Theorem 2.
Corollary 2. Theorem 2 remains valid in the landmark triple comparisons setting (meaning with
$\phi_n$ as just described) as long as the landmarks become dense in $U$.

Jamieson and Nowak (2011) study the number of triple comparisons that are needed for exact 
ordinal embedding. With a counting argument, they show that at least $C n \log n$ comparisons are
needed, where C is a constant depending only on d. If we only insist that the embedding respects 
the comparisons that are provided, then Corollary 2 implies that a landmark design is able to 
be consistent as long as the landmarks become dense in U. This consistency implies that, as the 
sample size increases, an embedding that respects the landmark comparisons also respects all other 
comparisons approximately. This is achieved with $O(n\ell_n^2 + \ell_n^3)$ triple comparisons, where $\ell_n := |L_n|$
is the number of landmarks, and the conditions of Corollary 2 can be fulfilled with $\ell_n \to \infty$ at any
speed, so that the number of comparisons is nearly linear in $n$.
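
To give a feel for the size of a landmark design, the Python sketch below enumerates the triples $(i,j,k)$, with $i$ any item and $j,k$ landmarks, for which item $i$ is closer to landmark $j$ than to landmark $k$. The point set and the rule of taking about $\log n$ landmarks are hypothetical choices meant only to illustrate a slowly diverging number of landmarks.

```python
import itertools
import math


def landmark_triples(points, landmarks):
    """Triples (i, j, k), with j and k landmarks, such that ||x_i - x_j|| < ||x_i - x_k||."""
    return [(i, j, k)
            for i in range(len(points))
            for j, k in itertools.product(landmarks, repeat=2)
            if math.dist(points[i], points[j]) < math.dist(points[i], points[k])]


def num_landmarks(n):
    """Hypothetical slowly diverging choice: roughly log(n) landmarks."""
    return max(2, math.ceil(math.log(n)))


# Deterministic toy point set in the plane; the first few items serve as landmarks.
points = [(math.cos(t), math.sin(2 * t)) for t in range(50)]
landmarks = list(range(num_landmarks(len(points))))
triples = landmark_triples(points, landmarks)

n, ell = len(points), len(landmarks)
print(len(triples), n * ell * ell)
```

For each item, at most one of $(i,j,k)$ and $(i,k,j)$ can be listed, so the number of informative triples is at most $n\ell(\ell-1)/2$, which grows only slightly faster than $n$ when $\ell$ grows slowly.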

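Proof. We focus on the weakly isotonic case, where we assume that $U = U^h$ for some $h > 0$. Let $\Lambda_n =
\{x_l : l \in L_n\}$ denote the set of landmarks. Since $\Lambda_n$ becomes dense in $U$, meaning $\eta_n := \delta_H(\Lambda_n, U) \to 0$,
by Theorem 2, there is a sequence of similarities $S_n$ such that $\zeta_n := \max_{x \in \Lambda_n} \|\phi_n(x) - S_n(x)\| \to 0$.
Now, for $x \in \Omega_n$, let $\tilde{x} \in \Lambda_n$ such that $\|x - \tilde{x}\| \le \eta_n$. We have

$$\|\phi_n(x) - S_n(x)\| \le \|\phi_n(x) - \phi_n(\tilde{x})\| + \|\phi_n(\tilde{x}) - S_n(\tilde{x})\| + \|S_n(\tilde{x}) - S_n(x)\|. \tag{11}$$

The first term is bounded by $C\eta_n^{1/(2d)}$ by Lemma 4, for some constant $C$. The middle term is
bounded by $\zeta_n$. For the third term, express $S_n$ in the form $S_n(x) = \beta_n R_n(x) + b_n$, where $\beta_n > 0$, $R_n$
is an orthogonal transformation, and $b_n \in \mathbb{R}^d$. Take two distinct landmarks $x^\dagger, x^\ddagger \in \Lambda_n$ such that
$\|x^\dagger - x^\ddagger\| \ge \mathrm{diam}(U)/2$, which exist when $n$ is sufficiently large. Since

$$\|S_n(x^\dagger) - S_n(x^\ddagger)\| = \beta_n \|x^\dagger - x^\ddagger\| \ge \beta_n\, \mathrm{diam}(U)/2$$

and, at the same time,

$$\|S_n(x^\dagger) - S_n(x^\ddagger)\| \le \|S_n(x^\dagger) - \phi_n(x^\dagger)\| + \|\phi_n(x^\dagger) - \phi_n(x^\ddagger)\| + \|\phi_n(x^\ddagger) - S_n(x^\ddagger)\|$$
$$\le \zeta_n + \mathrm{diam}(Q) + \zeta_n \le 2\, \mathrm{diam}(Q), \quad \text{eventually},$$

we have $\beta_n \le \bar{\beta} := 4\, \mathrm{diam}(Q)/\mathrm{diam}(U)$. Hence, the third term on the RHS of (11) is bounded by
$\bar{\beta}\eta_n$. Thus, the RHS of (11) is bounded by $C\eta_n^{1/(2d)} + \zeta_n + \bar{\beta}\eta_n$, which tends to 0 as $n \to \infty$. This
being valid for any $x \in \Omega_n$, we conclude. □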

We remark that at the very end of the proof, we obtained a rate of convergence as a function
of the density of the landmarks and the convergence rate implicit in Theorem 2. This leads to the
following rate for the quadruple comparisons setting, which corresponds to the situation where

$$\mathcal{C}_n = \{(i,j,k) \in [n] \times L_n^2 : \delta_{ij} < \delta_{ik}\} \cup \{(i,j,k,\ell) \in L_n^4 : \delta_{ij} < \delta_{k\ell}\}.$$

Here, $\phi_n$ is constrained to be isotonic on the set of landmarks and, as before, is required to respect
the ordering of the distances from any data point to the landmarks.

Corollary 3. Consider the setting (6) in the landmark quadruple comparisons setting (meaning
with $\phi_n$ as just described). Let $\Lambda_n$ denote the set of landmarks and set $\eta_n = \delta_H(\Lambda_n, U)$. There is a
constant $C > 0$ and a sequence of similarities $S_n$ such that $\max_{x \in \Omega_n} \|\phi_n(x) - S_n(x)\| \le C\eta_n$.

Proof. The proof is parallel to that of Corollary 2. Here, we apply Theorem 3 to get $\zeta_n \le C_0 \eta_n$.
This bounds the second term on the RHS of (11). The first term is bounded by $C_1 \eta_n$ by Lemma 3,
while the third term is bounded by $\bar{\beta}\eta_n$ as before. ($C_0, C_1$ are constants.) □
Computational complexity. We now discuss the computational complexity of ordinal embedding
with a landmark design. The obvious approach has two stages. In the first stage, the landmarks
are embedded. This is the goal of (Agarwal et al., 2007), for example. Here, we use brute force.

Proposition 1. Suppose that $m$ items are in fact points in Euclidean space and their dissimilarities
are their pairwise Euclidean distances. Then whether in the triple or quadruple comparisons setting,
an exact ordinal embedding of these $m$ items can be obtained in finite expected time.

Proof. The algorithm we discuss is very naive: we sample $m$ points iid from the uniform distribution 
on the unit ball, and repeat until the ordinal constraints are satisfied. Since checking the latter can 
be done in finite time, it suffices to show that there is a strictly positive probability that one such 
sample satisfies the ordinal constraints. Let $\mathcal{X}_m$ denote the set of $m$-tuples $(x_1, \dots, x_m) \in B(0,1)^m$ 
that satisfy the ordinal constraints, meaning that $\|x_i - x_j\| < \|x_k - x_\ell\|$ when $(i,j,k,\ell) \in \mathcal{C}$. Seeing 
$\mathcal{X}_m$ as a subset of $B(0,1)^m \subset \mathbb{R}^{dm}$, it is clearly open. And sampling $x_1, \dots, x_m$ iid from the uniform 
distribution on $B(0,1)$ results in sampling $(x_1, \dots, x_m)$ from the uniform distribution on $B(0,1)^m$, 
which assigns a positive mass to any open set. □ 
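
The naive procedure in this proof can be sketched in a few lines. The following is only an illustration under stated assumptions: the function names, the `max_tries` cap and the three-point example are hypothetical and not part of the paper, and the comparisons are generated from a known configuration so that a valid embedding is guaranteed to exist.

```python
import itertools
import numpy as np


def sample_unit_ball(m, d, rng):
    """Draw m iid points from the uniform distribution on the unit ball of R^d."""
    directions = rng.standard_normal((m, d))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    radii = rng.random(m) ** (1.0 / d)  # radius law that makes the draw uniform in volume
    return directions * radii[:, None]


def pairwise_distances(points):
    """Matrix of Euclidean distances between the rows of `points`."""
    diff = points[:, None, :] - points[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def quadruple_comparisons(points):
    """All (i, j, k, l) with ||x_i - x_j|| < ||x_k - x_l||, over unordered pairs."""
    dist = pairwise_distances(points)
    pairs = list(itertools.combinations(range(len(points)), 2))
    return [(i, j, k, l) for (i, j) in pairs for (k, l) in pairs
            if dist[i, j] < dist[k, l]]


def satisfies(points, comparisons):
    """Check every ordinal constraint on a candidate configuration."""
    dist = pairwise_distances(points)
    return all(dist[i, j] < dist[k, l] for (i, j, k, l) in comparisons)


def brute_force_embedding(comparisons, m, d, rng, max_tries=100_000):
    """Resample until the constraints hold (max_tries is a safety cap, an assumption)."""
    for _ in range(max_tries):
        candidate = sample_unit_ball(m, d, rng)
        if satisfies(candidate, comparisons):
            return candidate
    raise RuntimeError("no valid sample found within max_tries")


# Hypothetical example: three points in the plane with distinct pairwise distances.
rng = np.random.default_rng(0)
truth = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
comparisons = quadruple_comparisons(truth)
embedding = brute_force_embedding(comparisons, m=3, d=2, rng=rng)
```

As Remark 2 stresses, the expected number of resamplings grows quickly with the number of items, so this is a proof device rather than a practical method.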

In the second stage, each point that is not a landmark is embedded based on the order of 
its distances to the landmarks. We briefly mention the work of Davenport (2013), who develops a 
convex method for performing this task. Here, we are content with knowing that this can be 
done, for each point, in finite time depending only on the number of landmarks. For example, a brute 
force approach starts by computing the Voronoi diagram of the landmarks, and iteratively repeats 
within each cell, creating a tree structure. Each point that is not a landmark is placed by going 
from the root to a leaf, and choosing any point in that leaf cell, say its barycenter. 
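
As a rough illustration of the second stage, the sketch below places a single non-landmark point by rejection sampling until the order of its distances to the embedded landmarks matches the observed order. This is neither the Voronoi-tree construction described above nor the method of Davenport (2013); the search box, the function names and the example values are assumptions made for the illustration.

```python
import numpy as np


def distance_ranking(point, landmarks):
    """Indices of the landmarks sorted from nearest to farthest."""
    return tuple(np.argsort(np.linalg.norm(landmarks - point, axis=1)))


def place_point(target_ranking, landmarks, rng, max_tries=100_000):
    """Sample candidates in a box around the landmarks until the ranking matches."""
    low = landmarks.min(axis=0) - 1.0   # hypothetical search box, an assumption
    high = landmarks.max(axis=0) + 1.0
    for _ in range(max_tries):
        candidate = rng.uniform(low, high)
        if distance_ranking(candidate, landmarks) == tuple(target_ranking):
            return candidate
    raise RuntimeError("no candidate matched the ranking within max_tries")


# Hypothetical example: three embedded landmarks and one hidden point.
rng = np.random.default_rng(1)
landmarks = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
hidden = np.array([0.2, 0.1])
ranking = distance_ranking(hidden, landmarks)
placed = place_point(ranking, landmarks, rng)
```

The box may miss the target cell in general, in which case the loop fails; the cell-by-cell construction in the text avoids this by working directly with the cells.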

Thus, if there are $\ell$ landmarks, the first stage is performed in expected time $F(\ell)$, and the 
second stage is performed in time $(n - \ell)G(\ell)$. The overall procedure is thus computed in expected 
time $F(\ell) + (n - \ell)G(\ell)$. 

Remark 2. The procedure described above is not suggested as a practical means to perform ordinal 
embedding with a landmark design. The first stage, described in Proposition 1, has finite expected 
time, but likely not polynomial in the number of landmarks. For a practical method, we can suggest 
the following: 

1. Embed the landmarks using the method of Agarwal et al. (2007) (which solves a semidefinite 
program) or the method of Terada and Von Luxburg (2014) (which uses an iterative 
minimization-majorization strategy). 

2. Embed the remaining points using the method of Davenport (2013) (which solves a quadratic 
program). 

Although this method is practical and reasonable, we cannot provide any theoretical guarantees for it. 


3 More proofs 

In this section we gather the remaining proofs and some auxiliary results. We introduce some 
additional notation and basic concepts. For $z_1, \dots, z_m \in \mathbb{R}^d$, let $\mathrm{Aff}(z_1, \dots, z_m)$ denote their affine 
hull, meaning the affine subspace they generate in $\mathbb{R}^d$. For a vector $x$ in a Euclidean space, let 
$\|x\|$ denote its Euclidean norm. For a real matrix $M$, let $\|M\|$ denote its usual operator norm, 
meaning $\|M\| = \max\{\|Mx\| : \|x\| \le 1\}$, and $\|M\|_F = \sqrt{\operatorname{tr}(M^\top M)}$ its Frobenius norm. 

Regular simplexes. These will play a central role in our proofs. We say that $z_1, \dots, z_m \in \mathbb{R}^d$, with 
$m \ge 2$, form a regular simplex if their pairwise distances are all equal. We note that, necessarily, 
$m \le d + 1$, and that regular simplexes in the same Euclidean space and with the same number of 
(distinct) nodes $m$ are similarity transformations of each other, for example, segments ($m = 2$), 
equilateral triangles ($m = 3$), regular tetrahedra ($m = 4$). By recursion on the number of vertices, $m$, it is 
easy to prove the following. 

Lemma 6. Let $z_1, \dots, z_m$ form a regular simplex with edge length 1 and let $\mu$ denote the barycenter 
of $z_1, \dots, z_m$. Then $\|z_i - \mu\| = \sqrt{(m-1)/2m}$, and if $z, z_1, \dots, z_m$ form a regular simplex, then 
$\|z - \mu\| = \sqrt{(m+1)/2m}$. (In dimension $m$, there are exactly two such points $z$.) 
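
The two formulas of Lemma 6 are easy to check numerically. The sketch below uses one convenient realization of a unit-edge regular simplex, the scaled standard basis of $\mathbb{R}^m$; this choice and the function names are assumptions made for illustration only.

```python
import numpy as np


def regular_simplex(m):
    """m points in R^m (rows) with all pairwise distances equal to 1."""
    return np.eye(m) / np.sqrt(2.0)


def lemma6_quantities(m):
    """Return ||z_1 - mu|| and the distances from an apex z to z_1, ..., z_m."""
    z = regular_simplex(m)
    mu = z.mean(axis=0)
    vertex_to_barycenter = np.linalg.norm(z[0] - mu)
    # Apex: move from the barycenter along the unit normal of the affine hull,
    # by the height sqrt((m + 1) / (2m)) predicted by the lemma.
    normal = np.ones(m) / np.sqrt(m)
    height = np.sqrt((m + 1) / (2 * m))
    apex = mu + height * normal
    apex_edges = np.linalg.norm(z - apex, axis=1)
    return vertex_to_barycenter, apex_edges


for m in range(2, 7):
    r, edges = lemma6_quantities(m)
    print(m, r, np.sqrt((m - 1) / (2 * m)), edges)
```

Replacing `normal` by its negative gives the second admissible apex mentioned in the lemma.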

3.1 Proof of Theorem 1 

We assume $d \ge 2$. See (Kleindessner and von Luxburg, 2014) for the case $d = 1$. We divide the 
proof into several parts. 

Continuous extension. Lemma 4 implies that (f is locally uniformly continuous. Indeed, take 
xq e 12 and let r > 0 such that B(xo,r) c U and (j) is weakly isotonic on B{xo,r) n P. Applying 
Lemma 4 with V = B{xo,r) and A = P n B{xo,r) — so that (5_ff(A, E) = 0 because A is dense in V 
— and noting that = E, yields a constant Cr > 0 such that \\4>{x) -4){x')\\ < Cr\\x-x'\\^^^, for all 
x,x' e Ll n B(xo,r). Being locally uniformly continuous, we can uniquely extend (j) to a, continuous 
function on U, also denoted by cj). By continuity, this extension is locally weakly isotonic on U. 

Isosceles preservation. Sikorska and Szostok (2004) say that a function $f : V \subset \mathbb{R}^d \to \mathbb{R}^d$ 
preserves isosceles triangles if 

$$\|x - y\| = \|x - z\| \ \Rightarrow\ \|f(x) - f(y)\| = \|f(x) - f(z)\|, \quad \forall x, y, z \in V.$$

In our case, by continuity, we also have that $\phi$ preserves isosceles triangles locally. Indeed, for the 
sake of pedagogy, let $u \in U$ and $r > 0$ such that $B(u,r) \subset U$ and $\phi$ is weakly isotonic on $B(u,r)$. 
Take $x, y, z \in B(u, r/2)$ such that $\|x - y\| = \|x - z\|$. For $t \in \mathbb{R}$, define $z_t = (1 - t)x + tz$. Let $t > 1$ 
such that $z_t \in B(u, r)$. Because $\|x - y\| < t\|x - z\| = \|x - z_t\|$, we have $\|\phi(x) - \phi(y)\| \le \|\phi(x) - \phi(z_t)\|$. 
Letting $t \searrow 1$, we get $\|\phi(x) - \phi(y)\| \le \|\phi(x) - \phi(z)\|$ by continuity of $\phi$. Since $y$ and $z$ play the same 
role, the converse inequality is also true, and combined, they yield an equality. 

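Midpoint preservation. Let $V \subset \mathbb{R}^d$ be convex. We say that a function $f : V \to \mathbb{R}^d$ preserves 
midpoints if 

$$f\big(\tfrac{1}{2}(x + y)\big) = \tfrac{1}{2}\big(f(x) + f(y)\big), \quad \forall x, y \in V.$$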

We now show that $\phi$ preserves midpoints, locally. Kleindessner and von Luxburg (2014) also do 
that; however, our arguments are closer to those of Sikorska and Szostok (2004), who make use of 
regular simplexes. The important fact is that a function that preserves isosceles preserves regular 
simplexes. Let $u \in U$ and $r > 0$ such that $B(u,r) \subset U$ and $\phi$ preserves isosceles on $B(u,r)$. Take 
$x, y \in B(u, r/2)$, and let $\mu = (x + y)/2$. Let $z_1, \dots, z_d$ form a regular simplex with barycenter $\mu$ and 
side length $s$, and such that $\|x - z_i\| = s$ for all $i$. In other words, $x, z_1, \dots, z_d$ forms a regular simplex 
placed so that $\mu$ is the barycenter of $z_1, \dots, z_d$. By symmetry, $y, z_1, \dots, z_d$ forms a regular simplex 
also. By Lemma 6, we have $\|z_i - \mu\|/\|x - \mu\| = \sqrt{(d-1)/(d+1)}$, so that $z_1, \dots, z_d \in B(\mu, r/2) \subset 
B(u,r)$, by the triangle inequality and the fact that $\|x - \mu\| < r/2$. Hence, $\phi(x), \phi(z_1), \dots, \phi(z_d)$ 
and $\phi(y), \phi(z_1), \dots, \phi(z_d)$ are regular simplexes. If one of them is singular, so is the other one, in 
which case $\phi(x) = \phi(y) = \phi(\mu)$. Otherwise, necessarily $\phi(x)$ is the symmetric of $\phi(y)$ with respect 
to $\mathrm{Aff}(\phi(z_1), \dots, \phi(z_d))$; the only other possibility would be that $\phi(x) = \phi(y)$, but in that case we 
would still have that $\phi(z_i) = \phi(x)$ for all $i \in [d]$, since $\|x - z_i\|/\|x - \mu\| = \sqrt{2d/(d+1)}$ by Lemma 6 
(implying that $\|x - z_i\| < \|x - y\|$) and $\phi$ is weakly isotonic in that neighborhood. So assume that 
$\phi(x)$ is the symmetric of $\phi(y)$ with respect to $\mathrm{Aff}(\phi(z_1), \dots, \phi(z_d))$. For $a \in \{x, y, \mu\}$, $\|a - z_i\|$ is 
constant in $i$, and therefore so is $\|\phi(a) - \phi(z_i)\|$, so that $\phi(a)$ belongs to the line of points equidistant 
to $\phi(z_1), \dots, \phi(z_d)$. This implies that $\phi(x), \phi(y), \phi(\mu)$ are collinear. And because $\|\mu - x\| = \|\mu - y\|$, we also 
have $\|\phi(\mu) - \phi(x)\| = \|\phi(\mu) - \phi(y)\|$, so that $\phi(\mu)$ is necessarily the midpoint of $\phi(x)$ and $\phi(y)$. 

Conclusion. We arrived at the conclusion that $\phi$ can be extended to a continuous function on 
$U$ that preserves midpoints locally. We then use the following simple results in sequence: with 
Lemma 7, we conclude that $\phi$ is locally affine; with Lemma 8, we conclude that $\phi$ is in fact affine 
on $U$; and with Lemma 9, we conclude that $\phi$ is in fact a similarity on $U$. 

Lemma 7. Let V be a convex set of a Euclidean space and let f be a continuous function on V 
with values in a Euclidean space that preserves midpoints. Then f is an affine transformation. 

Proof. This result is in fact well-known, and we only provide a proof for completeness. It suffices 
to prove that $f$ is such that $f((1 - t)x + ty) = (1 - t)f(x) + tf(y)$ for all $x, y \in V$ and all $t \in [0,1]$. 
Starting with the fact that this is true when $t = 1/2$, by recursion we have that this is true when 
$t$ is dyadic, meaning, of the form $t = k2^{-j}$, where $j \ge 1$ and $k \le 2^j$ are both integers. Since dyadic 
numbers are dense in $[0,1]$, by continuity of $f$, we deduce the desired property. □ 
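
To make the recursion concrete, the first step (from $t = 1/2$ to $t = 1/4$) can be written out as follows. It only uses midpoint preservation and the convexity of $V$, which ensures that $(x+y)/2 \in V$; the general dyadic case repeats the same argument.

```latex
\begin{align*}
f\big(\tfrac34 x + \tfrac14 y\big)
  &= f\Big(\tfrac12 x + \tfrac12 \cdot \tfrac{x+y}{2}\Big)
   = \tfrac12 f(x) + \tfrac12 f\big(\tfrac{x+y}{2}\big) \\
  &= \tfrac12 f(x) + \tfrac12 \Big(\tfrac12 f(x) + \tfrac12 f(y)\Big)
   = \tfrac34 f(x) + \tfrac14 f(y).
\end{align*}
```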

Lemma 8. A locally affine function over an open and connected subset of a Euclidean space is the 
restriction of an affine function over the whole space. 

Proof. Let $U$ be the domain and $f$ the function. Cover $U$ with a countable number of open balls 
$B_i, i \in I$, such that $f$ coincides with an affine function $f_i$ on $B_i$. Take $i, j \in I$ distinct. Since $U$ 
is connected, there must be a sequence $i = k_1, \dots, k_m = j$, all in $I$, such that $B_{k_s} \cap B_{k_{s+1}} \ne \emptyset$ for 
$s \in [m - 1]$. Since $B_{k_s} \cap B_{k_{s+1}}$ is a nonempty open set, we must have $f_{k_s} = f_{k_{s+1}}$, and this being true for all $s$, 
it implies that $f_i = f_j$. □ 

Lemma 9. An affine function that preserves isosceles locally is a similarity transformation. 

Proof. Let $f$ be an affine function that preserves isosceles in an open ball. Without loss of generality, 
we may assume that the ball is $B(0,2)$ and that $f(0) = 0$ (so that $f$ is linear). Fix $u_0 \in \partial B(0,1)$ 
and let $a = \|f(u_0)\|$. Take $x \in \mathbb{R}^d$ different from 0 and let $u = x/\|x\|$. We have $\|f(x)\|/\|x\| = \|f(u)\| = 
\|f(u) - f(0)\| = \|f(u_0) - f(0)\| = \|f(u_0)\| = a$. Hence, $\|f(x)\| = a\|x\|$, valid for all $x \in \mathbb{R}^d$, and $f$ 
being linear, this implies that $f$ is a similarity. □ 

3.2 Auxiliary results 

We list here a number of auxiliary results that will be used in the proof of Theorem 3. 

The following result is a perturbation bound for trilateration, which is the process of locating 
a point based on its distance to landmark points. For a real matrix $Z$, let $\sigma_k(Z)$ denote its $k$-th 
largest singular value. 

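Lemma 10. Let $z_1, \dots, z_{d+1} \in \mathbb{R}^d$ such that $\mathrm{Aff}(z_1, \dots, z_{d+1}) = \mathbb{R}^d$ and let $Z$ denote the matrix with 
columns $z_1, \dots, z_{d+1}$. Consider $p, q \in \mathbb{R}^d$ and define $a_i = \|p - z_i\|$ and $b_i = \|q - z_i\|$ for $i \in [d+1]$. Then 

$$\|p - q\| \le \tfrac{1}{2}\sqrt{d}\,\sigma_d(Z)^{-1} \max_i \big|a_i^2 - a_{d+1}^2 - b_i^2 + b_{d+1}^2\big| \le \sqrt{d}\,\sigma_d(Z)^{-1} \max_i \big|a_i^2 - b_i^2\big|.$$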

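Proof. Assume without loss of generality that $z_{d+1} = 0$. In that case, note that $a_{d+1} = \|p\|$ and 
$b_{d+1} = \|q\|$. Also, redefine $Z$ as the matrix with columns $z_1, \dots, z_d$, and note that the first $d$ 
singular values remain unchanged. Since $\mathrm{Aff}(z_1, \dots, z_{d+1}) = \mathbb{R}^d$, there is $\alpha = (\alpha_1, \dots, \alpha_d) \in \mathbb{R}^d$ 
and $\beta = (\beta_1, \dots, \beta_d) \in \mathbb{R}^d$ such that $p = \sum_{i \in [d]} \alpha_i z_i = Z\alpha$ and $q = \sum_{i \in [d]} \beta_i z_i = Z\beta$. For $p$, we 
have $a_i^2 = \|p - z_i\|^2 = \|p\|^2 + \|z_i\|^2 - 2 z_i^\top Z\alpha$ for all $i \in [d]$, or in matrix form, $Z^\top Z\alpha = \tfrac{1}{2}u$, where 
$u = (u_1, \dots, u_d)$ and $u_i = a_{d+1}^2 - a_i^2 + \|z_i\|^2$. Similarly, we find $Z^\top Z\beta = \tfrac{1}{2}v$, where $v = (v_1, \dots, v_d)$ and 
$v_i = b_{d+1}^2 - b_i^2 + \|z_i\|^2$. Hence, we have 

$$\|Z^\top(p - q)\| = \|Z^\top Z\alpha - Z^\top Z\beta\| = \tfrac{1}{2}\|u - v\| = \tfrac{1}{2}\sqrt{\sum_{i \in [d]} \big(a_{d+1}^2 - a_i^2 - b_{d+1}^2 + b_i^2\big)^2}$$
$$\le \tfrac{1}{2}\sqrt{d}\, \max_i \big|a_i^2 - a_{d+1}^2 - b_i^2 + b_{d+1}^2\big| \le \sqrt{d}\, \max_i \big|a_i^2 - b_i^2\big|.$$

Simultaneously, $\|Z^\top(p - q)\| \ge \sigma_d(Z)\|p - q\|$. Combining both inequalities, we conclude. □ 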
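
The linear system appearing in this proof doubles as a trilateration routine, and the perturbation bound can be checked numerically. The sketch below assumes, as in the proof, that the last landmark is the origin; the function names and the numerical example are hypothetical.

```python
import numpy as np


def trilaterate(landmarks, sq_dists_to_landmarks, sq_dist_to_origin):
    """Recover p from squared distances, assuming the last landmark z_{d+1} is 0.

    landmarks: (d, d) array whose rows are z_1, ..., z_d.
    """
    z = np.asarray(landmarks, dtype=float)
    # u_i = a_{d+1}^2 - a_i^2 + ||z_i||^2, so that z_i^T p = u_i / 2.
    u = sq_dist_to_origin - sq_dists_to_landmarks + np.sum(z**2, axis=1)
    return np.linalg.solve(z, u / 2.0)


def perturbation_bound(landmarks, p, q):
    """Right-most bound of Lemma 10: sqrt(d) / sigma_d(Z) * max_i |a_i^2 - b_i^2|."""
    z = np.asarray(landmarks, dtype=float)
    d = z.shape[1]
    all_z = np.vstack([z, np.zeros(d)])          # append z_{d+1} = 0
    a2 = np.sum((all_z - p) ** 2, axis=1)
    b2 = np.sum((all_z - q) ** 2, axis=1)
    sigma_d = np.linalg.svd(z, compute_uv=False)[-1]  # smallest singular value
    return np.sqrt(d) / sigma_d * np.max(np.abs(a2 - b2))


# Hypothetical example in dimension 3.
z = np.array([[1.0, 0.0, 0.0], [0.2, 1.0, 0.0], [0.1, 0.3, 1.0]])
p = np.array([0.3, -0.4, 0.5])
q = np.array([0.35, -0.45, 0.4])
recovered = trilaterate(z, np.sum((z - p) ** 2, axis=1), float(p @ p))
bound = perturbation_bound(z, p, q)
```

The smaller the smallest singular value of the landmark matrix, the weaker the bound, which matches the intuition that nearly degenerate landmark configurations locate points poorly.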

For $\eta \in [0,1)$, we say that $z_1, \dots, z_m \in \mathbb{R}^d$ form an $\eta$-approximate regular simplex if 

$$\min_{i \ne j} \|z_i - z_j\| \ge (1 - \eta) \max_{i \ne j} \|z_i - z_j\|.$$
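
For a given set of points, the smallest admissible $\eta$ in this definition is simply one minus the ratio of the shortest to the longest edge. A minimal sketch (the function name is an assumption made for illustration):

```python
import itertools
import numpy as np


def simplex_irregularity(points):
    """Smallest eta such that the points form an eta-approximate regular simplex."""
    edges = [np.linalg.norm(p - q) for p, q in itertools.combinations(points, 2)]
    return 1.0 - min(edges) / max(edges)


equilateral = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, np.sqrt(3.0) / 2.0]])
right_triangle = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
eta_equilateral = simplex_irregularity(equilateral)
eta_right = simplex_irregularity(right_triangle)
```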

Lemma 11. Let $z_1, \dots, z_m$ form an $\eta$-approximate regular simplex with maximum edge length $\lambda$ 
achieved by $\|z_1 - z_2\|$. There is a constant $C_m$ and $z'_1, \dots, z'_m \in \mathrm{Aff}(z_1, \dots, z_m)$ with $z'_1 = z_1$ and 
$z'_2 = z_2$ and forming a regular simplex with edge length $\lambda$, such that $\max_i \|z'_i - z_i\| \le \lambda C_m \eta$. 

Proof. By scale equivariance, we may assume that A = 1. We use an induction on m. In what 
follows. Cm,Cm,Cm, etc, are constants that depend only on m. For m = 2, the statement is 
trivially true. Suppose that it is true for m > 2 and consider an ry-approximate regular simplex 
zi,... ,Zm+i G with maximum edge length 1. By changing d to m if needed, without loss of 
generality, assume that Aff( 2 :i,..., -2m+i) = In that case, zi,Zm is an ry-approximate regular 
simplex with maximum edge length achieved by H^i -Z 2 II = 1, and by the inductive hypothesis, this 
implies the existence oi z[,Zm ^ A := Aff(zi,..., Zm) with z[ = zi and = ^2 and forming a 
regular simplex of edge length 1, such that maxjg^^j \z[ - Zi\ < CmP for some constant Cm- Let 
p be the orthogonal projection of Zm+i onto A. Before continuing, let V be the set of such p 
obtained when hxing z[,... ,z!^ and then varying Zi e B(zl, CmP) for i e [m] and Zm+i among the 
points that make an ry-approximate regular simplex with zi,... ,Zm- Let p' be the barycenter of 
z'i,...,Zm and note that p' e V. Now, set 6 = \zm+i - p\- By the Pythagoras theorem, we have 
Up - = \zm+i - -2i|P - with 1 - ry < \zm+i - ^iW ^ 1) so that 0 < 1 - - ||p - Zi\\‘^ < 2p. By the 

triangle inequality, |||p - z-|| - ||p - Zi||| < H^i - z-|| < CmP, so that 

|||P-2^111^ - = \\\p-4\\ - \\p - Zi\\\{\\p - z'iW - \\p-Zi\\) < \\Zi-z'iWWzi-z'iW 

< Cmp{2 + CmP) < C'mP, 

using the fact that \p- Zi \ < H^m+i - Zi\\ < 1. Hence, 

V(z{q--\\q- z'if = 1 - d^ ± C'mP, Vr e [m]}. 

Since p' e V, we must therefore have ||p - z'p = \p' - z-p ± 2C^p. By Lemma 10, this implies that 
||p-p'|| < Vm - 1 ai^_i{[z[---Zm])‘2Clfp =■ C'ffp. Let be on the same side of A as Zm+i and such 
that z'l,..., Zm,z'.^^i form a regular simplex. Note that p' is the orthogonal projection of . 2^+1 onto 
A. By the Pythagoras theorem, applied multiple times, we obtain the following. First, we have 

IPm+l ~ ^™-+l II ~ \^m+l ~ P P ~ PP ~ Zm+l\\ 

= \\z'm+l-P'f -‘^(.^'m+1-PYi^m+l - P) + \\p - Zm+lf + \\p - pf 
= {\\z'm,l-p'\\-\\p-Zmrl\\f^\\p'-pf, 
because 2:^+1 - n' and Zm+i - P are orthogonal to A, and therefore parallel to each other and both 
orthogonal to p' - p. For the second term, we already know that \\p' - p\\ < C^r], while the first 
term is bounded by (2(7^ + 2)‘^rf‘ since, on the one hand, 

ii4ri = = y - 21112 

while, on the other hand, 

\\p-Zm+lf = \\Zm+l - Zlf - \\p-Zif = 1 ± 2t/- \\p - Zif, 

and we know that \\p' - zjp = \\p - zip ± 20^7], Hence, we hnd that \z'^^i - Zm+iP ^ 

for some constant Cm+i function of m only. This shows that the induction hypothesis holds for 

m + 1 . □ 

Lemma 12. There are constants $C_m, C'_m > 0$ such that, if $z_1, \dots, z_m$ form an $\eta$-approximate regular 
simplex with maximum edge length $\lambda$, then $\sigma_{m-1}([z_1 \cdots z_m]) \ge \lambda C_m (1 - C'_m \eta)$. 

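Proof. By scale equivariance, we may assume that $\lambda = 1$. By Lemma 11, there is a constant $C''_m$ and 
$z'_1, \dots, z'_m \in \mathrm{Aff}(z_1, \dots, z_m)$ forming a regular simplex with edge length 1 such that $\max_i \|z'_i - z_i\| \le 
C''_m \eta$. Write $Z := [z_1 \cdots z_m]$ and $Z' := [z'_1 \cdots z'_m]$. By Weyl's inequality (Horn and Johnson, 1990, Cor 7.3.8), $\sigma_{m-1}(Z) \ge \sigma_{m-1}(Z') - \|Z - Z'\|$. 
On the one hand, $\sigma_{m-1}(Z')$ is a positive constant depending only on $m$, while on the other hand, 

$$\|Z - Z'\| \le \|Z - Z'\|_F = \sqrt{\sum_i \|z_i - z'_i\|^2} \le \sqrt{m}\, C''_m \eta.$$ □ 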

Lemma 13. Let $z_1, \dots, z_m$ form an $\eta$-approximate regular simplex with maximum edge length $\lambda$ 
and barycenter $\mu$. Let $p \in \mathrm{Aff}(z_1, \dots, z_m)$ and define $\gamma = \max_i \|p - z_i\|^2 - \min_i \|p - z_i\|^2$. There is a 
constant $C_m \ge 1$ depending only on $m$ such that $\|p - \mu\| \le C_m \lambda \gamma$ when $\eta \le 1/C_m$. 

Proof. By scale equivariance, we may assume that A = 1. By Lemma 10, we have 
||p-^|| < lcrp_i([zr--Zm])max|||p- ZmP - ||p-Zip|. 

2 i 

By Lemma 12, there is a constant C^ such that o'f^_i{[zi---Zm]) < C'm when rj < IjC'm. And we 
also have max* |||p - z^P - ||p - 2 jp| < 7 . From this, we conclude. □ 

Lemma 14. Let '. K Q he isotonic, where A,Q c: Let v ^ and r > 0, and set e = 

dH{A,B{v,r)). There is C oc diam(Q)/r such that, for all x,x',x^,x^ e A with x,x' c B{v,3rlA) 
and for all 7 e (0, r/4 - 2e), 

\\x - x'W = \\x^ - x^\ ±7] ^ Wipi^x) -'ip{x')\\ = \\'ip{x^) -'ijj{x^)\\ ±C{p + e). ( 12 ) 

Proof. Let ^ = ||x - x'|| and Suppose that ^ < p + 2e, which implies that < 27 + 2e. 

In that case. Lemma 3 — where the constant there is denoted here by Ci oc diam{Q)/r — yields 
\\'fi{x) - 7 Ij{x')\\ < Ci{^ + e) < Ci{p + Se) and, similarly, \\7f{x^) - •ip{x^)\\ < Ci{2p + 3e). This proves 
(12). Henceforth, we assume that ^ > 7 + 2e. 

First assume that ^ In that case, we immediately have HV'C^;) -'ijj{x')\\ > ||'0(xp -'ijj{x^)\\. 
For the reverse, let yt = {1- t)x + tx', and note that \\yt - x\ = tf^. Take t = 1 - (7 + 2e)/^ and note 
that t e [0,1], so that yt e [xx'] c B(v,r), and therefore there is x* e A such that ||x* - yt\\ < e. 
We have ||x* - x|| < \\yt - x|| + ||x* - yt\\ <f,-p-£<f,\ so that \\fi{x) - 7p{x*)\\ < ||i/’(xp - 7p{x^)\\. 
Applying the triangle inequality and Lemma 3, we then have 

\\7p{x) - fiix*)\\ > \\7p{x) - fiix'fil - Wlpix') - '0(X*)|| 

> llV’(x) - fi{x')\\ -Cidlx'-x*!! +e). 
with $\|x' - x^*\| \le \|x' - y_t\| + \|y_t - x^*\| \le \eta + 3\epsilon$. 

When C we choose t = 1 + (ry + 2e)/^. Because x,x' c B{v,3rlA), we still have yt e B{v,r) 
because of the constraint on rj. The remaining arguments are analogous. 

When ^ repeating what we just did both ways and with rj - 0 yields the result. □ 

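Lemma 15. Consider $\psi : \Lambda \to \mathbb{R}^d$ isotonic, where $\Lambda \subset \mathbb{R}^d$. Let $V$ denote the convex hull of $\Lambda$. Set 
$\epsilon = \delta_H(\Lambda, V)$ and $c = \operatorname{diam}(\psi(\Lambda))/5\operatorname{diam}(\Lambda)$. Then $\|\psi(x) - \psi(x')\| \ge c\|x - x'\|$ for all $x, x' \in \Lambda$ such 
that $\|x - x'\| \ge 4\epsilon$. 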

Proof. We first prove that, if c> 0 and 77 > 4e are such that ||'0(x) - 7 /^( 3 ;') || < cry for all x,x' € A with 
\x - x'll < ry, then diam('0(A)) < c(4diam(A) + ry). Indeed, take x, x' e A. Let u = (x' - x)l\x' - x\ 
and L = ||x-x'||, and define yj = x + SjU where Sj = j{ri-3e) for j = 0,..., J ;= [L/(Ty-3e)J, and then 
let sj+i = L. By construction, yj e [xx'] c V, with r/o = x and yj+i = x'. Let xj e A be such that 
IIXj - r/jll < e, with xq = x and xj+i = x'. By the triangle inequality, ||xj+i - Xj\ < ||?/j+i - yj \ + 2e = 
Sj+i - Sj + 2e < ry. Hence, 

J 

||V;(x) -'0(x')|| < X! W'^^Xj) - ip{xj+i)\\ < (J + l)cry < c-— + cry < c(4diam(A) + ?y), 

jTo - 3e 

since ry - 3e > ry - 3ry/4 = ry/4 and L < diam(A). 

Now assume that ip is isotonic and suppose that ||r/^(x) - ip{x')\\ < c\\x - x'|| for some x,x' e A 
such that ry := ||x-x'|| > 4e. Then we have \\ip{x^) - ip{x^)\\ < crj when x^,x^ e A satisfy ||x^-x^|| < ry. 
We just showed that this implies that diam(r/;(A)) < c(4diam(H) +ry), and we conclude using the 
fact that ry < diam(A). □ 

The following result is on 1-nearest neighbor interpolation. 

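Lemma 16. Let $\Lambda$ be a subset of isolated points in $V \subset \mathbb{R}^d$ and set $\epsilon = \delta_H(\Lambda, V)$. For any function 
$\psi : \Lambda \to \mathbb{R}^d$, define its 1-nearest neighbor interpolation $\hat\psi : V \to \mathbb{R}^d$ as 

$$\hat\psi(y) = \frac{1}{|N_\Lambda(y)|} \sum_{x \in N_\Lambda(y)} \psi(x), \qquad N_\Lambda(y) := \operatorname*{arg\,min}_{x \in \Lambda} \|x - y\|. \tag{13}$$

Consider the modulus of continuity of $\psi$, which for $\eta > 0$ is defined as $\omega(\eta) = \sup\{\|\psi(x) - \psi(x')\| : 
x, x' \in \Lambda, \|x - x'\| \le \eta\}$. Then the modulus of continuity of $\hat\psi$, denoted $\hat\omega$, satisfies $\hat\omega(\eta) \le \omega(\eta + 2\epsilon)$. 
Moreover, for any $y, y' \in V$ and any $x, x' \in \Lambda$ such that $\|x - y\| \le \epsilon$ and $\|x' - y'\| \le \epsilon$, 

$$\|\hat\psi(y) - \hat\psi(y')\| = \|\psi(x) - \psi(x')\| \pm 2\omega(2\epsilon).$$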

Proof. Fix $\eta > 0$ and take $y, y' \in V$ such that $\|y - y'\| \le \eta$. We have $\|x - y\| \le \epsilon$ for all $x \in N_\Lambda(y)$ and 
$\|x' - y'\| \le \epsilon$ for all $x' \in N_\Lambda(y')$, so that $\|x - x'\| \le \|y - y'\| + 2\epsilon$ for all such $x$ and $x'$, by the triangle 
inequality. Therefore, 

$$\|\hat\psi(y) - \hat\psi(y')\| \le \sup\{\|\psi(x) - \psi(x')\| : x \in N_\Lambda(y), x' \in N_\Lambda(y')\}$$
$$\le \sup\{\|\psi(x) - \psi(x')\| : x, x' \in \Lambda, \|x - x'\| \le \eta + 2\epsilon\} = \omega(\eta + 2\epsilon).$$

Since this is true for all $y, y'$ such that $\|y - y'\| \le \eta$, we conclude that $\hat\omega(\eta) \le \omega(\eta + 2\epsilon)$. 

For the second part of the lemma, we have 

$$\|\hat\psi(y) - \hat\psi(y')\| = \|\psi(x) - \psi(x')\| \pm \|\hat\psi(y) - \psi(x)\| \pm \|\hat\psi(y') - \psi(x')\|,$$

where the second term is bounded by 

$$\|\hat\psi(y) - \psi(x)\| \le \sup\{\|\psi(\tilde x) - \psi(x)\| : \tilde x \in N_\Lambda(y)\}$$
$$\le \sup\{\|\psi(\tilde x) - \psi(x)\| : \|\tilde x - x\| \le 2\epsilon\} \le \omega(2\epsilon),$$

using the fact that $\|\tilde x - x\| \le \|\tilde x - y\| + \|y - x\| \le 2\epsilon$, and similarly for the third term. □ 
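
The interpolation rule (13) is straightforward to implement. In the sketch below, the tolerance used to detect ties in floating point, the function name, and the sample and values are assumptions made for illustration.

```python
import numpy as np


def nn_interpolate(y, sample, values, tol=1e-12):
    """Average of psi over the nearest neighbors of y in the sample, as in (13)."""
    y = np.asarray(y, dtype=float)
    dists = np.linalg.norm(sample - y, axis=1)
    nearest = dists <= dists.min() + tol   # tol catches ties up to rounding (an assumption)
    return values[nearest].mean(axis=0)


# Hypothetical sample points (rows) and the values of psi on them.
sample = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
values = 2.0 * sample + 1.0

single = nn_interpolate([0.1, 0.1], sample, values)   # unique nearest neighbor
tied = nn_interpolate([0.5, 0.0], sample, values)     # two equidistant neighbors
```

Averaging over ties is what makes the interpolation well defined everywhere on $V$, including on the boundaries between Voronoi cells.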

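Let $V \subset \mathbb{R}^d$ be convex. In our context, we say that $f : V \to \mathbb{R}^d$ is $\eta$-approximately midlinear if 

$$\big\|f\big(\tfrac{1}{2}(x + y)\big) - \tfrac{1}{2}\big(f(x) + f(y)\big)\big\| \le \eta, \quad \forall x, y \in V.$$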


Lemma 17. Let $V \subset \mathbb{R}^d$ be star-shaped with respect to some point in its interior. There is a 
constant $C$ depending only on $V$ such that, for any $\eta$-approximately midlinear function $f : V \to \mathbb{R}^d$, 
there is an affine function $T : \mathbb{R}^d \to \mathbb{R}^d$ such that $\sup_{x \in V} \|f(x) - T(x)\| \le C\eta$. 

Note that, if $V$ is a ball, then by invariance considerations, $C$ only depends on $d$. 

Proof. This is a direct consequence of (Vestfrid, 2003, Th 1.4). □ 

We say that $f : V \subset \mathbb{R}^d \to \mathbb{R}^d$ is an $\epsilon$-isometry if 

$$\|x - y\| - \epsilon \le \|f(x) - f(y)\| \le \|x - y\| + \epsilon, \quad \forall x, y \in V.$$

For a set $V \subset \mathbb{R}^d$, define its thickness as 

$$\theta(V) = \inf\{\operatorname{diam}(u^\top V) : u \in \mathbb{R}^d, \|u\| = 1\}.$$

Recalling the definition of $\rho$ in (7), we note that $\theta(V) \ge \rho(V)$, but that the two are distinct in 
general. 

Lemma 18. Let $V \subset \mathbb{R}^d$ be compact and such that $\theta(V) \ge \eta \operatorname{diam}(V)$ for some $\eta > 0$. There is a 
constant $C$ depending only on $d$ such that, if $f : V \to \mathbb{R}^d$ is an $\epsilon$-isometry, then there is an isometry 
$R : \mathbb{R}^d \to \mathbb{R}^d$ such that $\max_{x \in V} \|f(x) - R(x)\| \le C\epsilon/\eta$. 

Proof. This is a direct consequence of (Alestalo et al., 2001, Th 3.3). □ 


Lemma 19. Let $T : \mathbb{R}^d \to \mathbb{R}^d$ be an affine function that transforms a regular simplex of edge length 
1 into an $\eta$-approximate regular simplex of maximum edge length $\lambda > 0$. There is a constant $C$, 
depending only on $d$, and an isometry $R$, such that $\|T(x) - \lambda R(x)\| \le C\lambda\eta$ for all $x \in B(0,1)$. 

Proof. By invariance, we may assume $T$ is linear and that the regular simplex is formed by 
$0, z_1, \dots, z_d$ and has edge length 1. Letting $w_i = T(z_i)$, we have that $0, w_1, \dots, w_d$ form an 
$\eta$-approximate regular simplex of maximum edge length $\lambda := \max_i \|w_i\|$. Lemma 11 gives $0, w'_1, \dots, w'_d$ 
forming a regular simplex of edge length $\lambda$ such that $\max_i \|w_i - w'_i\| \le C_1\lambda\eta$ for some constant 
$C_1$. Let $R$ be the orthogonal transformation such that $R(z_i) = w'_i/\lambda$ for all $i \in [d]$. We have 
$\|T(z_i) - \lambda R(z_i)\| = \|w_i - w'_i\| \le C_1\lambda\eta$ for all $i$. In matrix notation, letting $Z := [z_1 \cdots z_d]$, we have 

$$\|TZ - \lambda RZ\| \le \|TZ - \lambda RZ\|_F = \sqrt{\sum_{i=1}^d \|Tz_i - \lambda Rz_i\|^2} \le \sqrt{d} \max_{i \in [d]} \|Tz_i - \lambda Rz_i\| \le \sqrt{d}\, C_1\lambda\eta.$$

At the same time, $\|TZ - \lambda RZ\| \ge \|T - \lambda R\|/\|Z^{-1}\|$ with $\|Z^{-1}\| = 1/\sigma_d(Z) = 1/\sigma_d([z_1 \cdots z_d])$ being a 
positive constant depending only on $d$. Hence, $\|T - \lambda R\| \le (\sqrt{d}/\sigma_d(Z))\,C_1\lambda\eta =: C_2\lambda\eta$. □ 
Lemma 20. Suppose that $S_1, S_2 : \mathbb{R}^d \to \mathbb{R}^d$ are two affinities such that $\max_{x \in B(y,r)} \|S_1(x) - S_2(x)\| \le 
\eta$ for some $y \in \mathbb{R}^d$ and $r > 0$. Then $\|S_1(x) - S_2(x)\| \le 2\eta\|x - y\|/r + \eta$ for all $x \in \mathbb{R}^d$. 

Proof. By translation and scale invariance, assume that $y = 0$ and $r = 1$. Let $L_i = S_i - S_i(0)$. 
For $x \in B(0,1)$, we have $\|L_1(x) - L_2(x)\| \le \|S_1(x) - S_2(x)\| + \|S_1(0) - S_2(0)\| \le 2\eta$. Hence, for $x \in \mathbb{R}^d$, 
$\|L_1(x) - L_2(x)\| \le 2\eta\|x\|$, which in turn implies that $\|S_1(x) - S_2(x)\| \le \|L_1(x) - L_2(x)\| + \|S_1(0) - 
S_2(0)\| \le 2\eta\|x\| + \eta$. □ 

3.3 Proof of Theorem 3 

Without loss of generality, we may assume that $D_n := \operatorname{diam}(\phi_n(\Omega_n)) \ge 1$. Indeed, suppose that 
$D_n < 1$, but different from 0, for otherwise $\phi_n$ is a degenerate similarity and the result follows. 
Let $\tilde\phi_n = D_n^{-1}\phi_n$, which is isotonic on $\Omega_n$ and satisfies $\operatorname{diam}(\tilde\phi_n(\Omega_n)) = 1$. If the result is true for 
$\tilde\phi_n$, there is a similarity $\tilde S_n$ such that $\max_{x \in \Omega_n} \|\tilde\phi_n(x) - \tilde S_n(x)\| \le C\epsilon_n$ for some constant $C$. (We 
implicitly assume that the set $\phi_n(\Omega_n)$ contains the origin, so that $\tilde\phi_n(\Omega_n)$ remains bounded.) We 
then have $\max_{x \in \Omega_n} \|\phi_n(x) - S_n(x)\| \le C D_n \epsilon_n < C\epsilon_n$, where $S_n := D_n \tilde S_n$ is also a similarity. 

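Let $r = \rho(U)$, so that there is some $u_*$ such that $B(u_*, r) \subset U$. Let $\Lambda_n = \Omega_n \cap B(u_*, r/2)$ and 
$\delta_n = \operatorname{diam}(\phi_n(\Lambda_n))$. Let $w$ be any unit-norm vector and define $y_\pm = u_* \pm (r/2 - \epsilon_n)w$. Let $x_\pm \in \Omega_n$ be 
such that $\|x_\pm - y_\pm\| \le \epsilon_n$. Necessarily, $x_\pm \in \Lambda_n$ because the distance from $y_\pm$ to $\partial B(u_*, r/2)$ exceeds 
$\epsilon_n$. Note that $\|x_- - x_+\| \ge r_1 := r - 4\epsilon_n$. By isotonicity, 

$$\|\phi_n(x) - \phi_n(x')\| \le \|\phi_n(x_-) - \phi_n(x_+)\| \le \delta_n, \quad \text{whenever } \|x - x'\| < r_1. \tag{14}$$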

Let yi,... ,yK be a (ri/3)-packing of U, so that K < C{d\am.{U)lr)^ for some constant C > 0. Let 
e be such that \xi^^ - yk\ < so that U c 6 (yfc,n/S) c Ufce[ic] 6 (a:ij^,r 2 ), where 

'<’2 ■= ^ 1/3 + En- Let Zk = Xjj, for clarity. Take x,x' e Because U is open, it is path-connected, so 
there is a continuous curve 7 : [0,1] 1 -^ t/ such that 7 ( 0 ) = x and 7 ( 1 ) = x'. Let ko e [.A] be such 
that X € B{zkQ,r 2 ) and sq = 0. Then for j > 0, let sj+i = inf{s > Sj : 7(5) ^ U;g[j] B(zki,r 2 )}, and let 
kj+i € [K] be such that \\zkj_^.^ - 7 (sj-i-i)|| < En- Let J = min{j : Sj+i = 00 }, which is indeed finite. By 
construction, \\zkj - Zk^^^ || < 2x2 < n when En < r/10. By (14), we have \\4>n{zkj) - 4>n{zkj^.^)\\ < Sn- 
Thus, by the triangle inequality, \\4>n{x) - 4>n{x')\\ < JSn < KSn- This being true for all x,x' e 
this prove that Sn > DnjK oc 6„(diam([/)/r)“'^. 

1-NN interpolation. Let (pn denote the 1-NN interpolation of pn as in (13). We claim that there 
is a Cq oc Dnjr and Cq oc (diam(17)/r)"'^-D„/r such that pn satisfies the following properties: for 
all y,y',yKy^ e U, 


||<^n(y) -<^n(y')ll ^ CoiWy-y'W +En), (15) 

\\y-y'\\<W -y^\\- 4 :En ^ \\Pn{y)-pn{y')\\<Hn{y^)- 4 >n{y^)\\+CQEn, (16) 

and also 

Uniy) - Pniy')\\ > CQ\\y-y'\\ -C^En, 
if y,y' e 6(M*,r/2) satisfy \y-y'\ > lOe^, 

and 

\\y-y'\\ = lly^ -y^ll ±?? ^ Un{y) - Pn{y')\\ = ll<^n(y^) -<^n(y^)|| ±CQ{y + En), 
if y,y' e 6(M*,r/2), < r/120 and 0 < r/ < r/5. 

Indeed, let x, x', x\x^ e such that ||a: - ?/||, ||x' - y'||, ||a:^ - 2/'^ 11, \ x^ - y^|| ^ 


( 18 ) 
For (15), we start by applying Lemma 16 to get 

||0n(y) -<^n(y')ll = Unix) - (j)n{x')\\ ± 2a;„(2e„) 

— ^n( 11^ ~ X II) + ^ ^n( ||y ~ 2/ || "*■ ‘^^n) 2W)T,(2£n)) 


where Wn is the modulus of continuity of We then use Lemma 3, which gives that iOnii]) < Crj 
for all r] and some C oc D^/r, to get a;„(||y - y'\\ + 2 e„) ± 2 a;„( 2 £„) < C{\\y - y'\\ + 6 e„). 

For (16), we first note that Hx - x'H < ||x^ - || by the triangle inequality, which in turn implies 

that \\(j)n{x) - (l)n{x')\\ < Wcpnix^) " (pn (x^)ll since (pn is isotonic. We then apply Lemma 16 to get 
that \(j)n{y) - ^n{y') \ ^ ll'^n( 2 /^) - <Pn{y^) \ + and conclude with Lemma 3 as for (15). 

For (17), we may apply Lemma 15 with A„. Let V be the convex hull of An, so that V c 
il(tt*,r/2). Let z be a point in that ball. If z ^ rt*, let w = (ti* - z)l\Ui, - 2 :||, and if z = «*, let w 
be any unit-norm vector. Dehne z' = z + £nW and notice that the distance from z' to dB(u*,rl2) 
exceeds £„. Therefore, if a; e is such that Hz' - a:|| < En, then necessarily, x e A„. We then note 
that ||z - x|| < 2£n- We conclude that 5//(A„, V) < 2e„. Since ||x - x'|| > ||y - y'|| - 2e„ > 4(2£:„), we 
get that \\(pn{x) - (pn{x')\\ >c||x-x'||, with c := diam((^„(A„))/5diam(A„) >Snl5r. We then apply 
Lemma 16 to obtain \\(pn{y) -<Pn{y')\\ ^ c||x-x'|| - 2 ci;„( 2 e„) > c||y-j/'|| - 2 (c + C')e„, using Lemma 3 
as for (15). 

For (18), note that x, x' e il(M*, r/2 + e„) c il(M*, 3r/4), and ||x - x'|| = ||x^ - x^\\ ± {rj + 4e„) by 
the triangle inequality. By Lemma 14 — where the constant there is denoted here by C oc Dnjr 
— this implies that 

\\(pn{x) - (pn{x)\\ = \\(pn{x^) - (pn{x^)\\ ±C'{r} + £n) 

when rj + 4e„ < r/4 - 2e„, which is true when < r/120 and y < r/5. We then apply Lemma 16 
together with Lemma 3, as for (15). 

CASE d = 1. This case is particularly simple. Note that U is a bounded open interval of M. We 
show that the function (pn is approximately midlinear on U. Take x,y and define y = {x + y)l2. 
By the fact that (pn takes its values in M, and (18), we have 


\{(pn{x) + (pn{y)) - $n{y) = h \$n{x) - (pn{y)\ “ \(pn{y) " <^n(h)l 


< C^Sn/2, 


when Enjx is small enough. Hence, <pn is (C'Qe,i)-approximate midlinear on U. By the result of 
Vestfrid (2003), namely Lemma 17, there is C oc 1 — since 17 is a ball — and an affine function 
Tn such that vaaiiy^u\(pn{y) -Tn{y)\ < CC^En- Since all affine transformations from M to M are 
(possibly degenerate) similarities, we conclude. 

CASE $d \ge 2$. For the remainder of this subsection, we assume that $d \ge 2$. 

Approximate midlinearity. We show that there is a constant C such that cpn is locally CEn- 
approximately midlinear. Take x,y e B{Ui,,rlA), and let /r = (x + y)/2. Let t > 0 be a constant to 
be set large enough later. 

If ||x- y|| < t£n, then by (15), (pn{x),^n{y) e B{^n{y),CQ{tl2 + !)£„), so that 

||<^n(h) - ^{<Pn{x) + (pn{y))\\ < Co{t/2 + l)e„. 

Therefore, assume that ||x-y|| > t£n- Let zi,..., be constructed as in the proof of Theorem 1. 
By construction, both x, zi,..., z^ and y, zi,..., z^ form regular simplexes, and /r is the barycenter 
of zi,..., Zrf. By Lemma 6 , for any i ^ j, 

||zi - y|| = \/{d- l)/ 2 d IIZj - Zjll = \/{d- l)l2d\/2dl{d+ 1 ) ||x - fi\\ < \\x - y\\/2, 
which coupled with the fact that $x, y \in B(u_*, r/4)$ yields that $z_i \in B(u_*, r/2)$ for all $i$. Now, let $z_0 = x$. 

By (18), we have miiii^j \\$n{zi) - ^n{zj)\\ > maxjj \\^n{zi) - ^n{zj)\\ - C^En- Let q = yJdl{2d + 2). 
By (17) and Lemma 6, 

\\K{Zi) - (i)n{Zj)\\ > Coll^i - ZjW - CqEu = clcd\\x-y\\ - C^En > (cgCdt - Cq )en. 

Hence, assuming t > 2CQlcgCd, we have minj^j \\^n{zi) - Mzj)\\ > (1 - ? 7 )maxjj \\^nizi) - ^n{zj)\\, 
where r] 2C'Q/(cQCrft). In that case, , 4'n{zi),..., (j)nizd) form a ? 7 -approximate regular 

simplex. By symmetry, the same is true of (pniu), ^n{zi), ■ ■ ■ ,(j)n{zd)- 

Define A = \\^n{x) - ^niy)\\- By Lemma 6, = Cd\\x-y\\ < \\x-y\\-4En when t > 4/(l-Cd), 

since ||x - y\\ > tEn and Cd < 1. By (16), this implies that \(f)n{zi) - (j)n{zj)\ < A + C^En- By (17), 
A > (cgt - CQ)En, so that A + C^En < 2A since we already assumed that t > 2(70/cqQ > 2CqIcq. 

For a € {x,y,y}, ||a- 2 :j|| is constant in i e [d]. Therefore, by (18), minj ||(?i„(a) - (j)n{zi)\\ > 
maxj ||</>n(a) - (pn{zi)\\ - C^En- Define as the orthogonal projection of (pnio-) onto the affine 
space A := Aff((^„( 2 :i),..., 4>n{zd)) and let 5a = ||<^n(o) - ^a\\- By the Pythagoras theorem, we have 
Ua-^n{zi)f = \\^n{a) - ^nizi)f-Sl. In particular, 

maxilla - ^n{zi)f -min ||^a - ^n{zi)f = max||(^„(a) -^n{zi)f -min \\^n{a) - ^n{zi)f 

III I 

< 2CQEnmm \\4>n{a) - ^n{zi)\\ + (Cgen)^ < CiEn, 

I 

where Ci 2(7qD„ + C^r, once < r. Let C denotes the barycenter of i^n(zi),..., 4>n{zd)- Assume 
that t is sufficiently large that y < 1/(72, where (72 1 is the constant of Lemma 13. By that 
lemma, and the fact that 4>n{zi),... ,(j)n{zd) form a ry-approximate regular simplex of maximum 
edge length bounded by A, we have \\^a - Cll ^ (72ACie„. Let L be the line passing through ( and 
perpendicular to A. We just proved that 4)n{x),4>n{y),4>n{y) are within distance C^Xeu from L, 
where := CiC 2 - 

Let ^ denote the orthogonal projection of (j)n{y) onto {(pn{x)(j)n{y))- Since ||x-/i|| = ||y-^||, we 
can apply (18) to get 

|||e-<^n(x)p-||C-0n(2/)lH 
= \Un{y) - ^n{x)f - Un{y) " ^n{y)f\ 

= |ll<^n(/^) -<^n(x)|| + \\4>n{y) - My)\\\ X | ||0n(/^) " <^n(x) || - ||<^n(/^) " 0n(l/) || | 

< 4(7q XEn, 

using the fact that max(||(/>„(//) - (/)„(x)||, ||(/>„(/r) - 4>n{y)\\) < A + C^En < 2A, due to (16) and 
||x - /i|| = \y - /r|| = |||x - y|| < ||x - y\ - 4e„ when t is large enough. By Lemma 13, we then 
obtain ||^ - ^{(f>n{x) + ^n{y))\\ ^ C' 4 Ae„ for some constant C 4 oc (7 q. In particular, recalling that 
A = \\^n{x) -^n{y)\\, this implies that ^ e [^n{x)4>n{y)] when < 1/20^. 

It remains to argue that <i>n{y) is close to We already know that ^n{x),(f>n{y),4'n{y) are 
within distance C^XEn from L, and by convexity, the same must be true of Let M = {(pn{x)4’n{y)) 
and 6 = z(^L,M). Let Pm denote the orthogonal projection onto M, when M is a linear subspace. 
By Pythagoras theorem, 

A^ = \^n{x) - ^n{y)f = \PL{^n{x) - ^n{y))f + ||Pli ((^n(x) - (^n(y)) P 
<(cosd)2A2 + (2(73Ae„)^ 
implying that sin0 < 2 C 2 £n- Since \Pl - Pm\ = sin0 and - Os parallel to M, we also have 

= WPUMf^) - OP + IIPlOOO) -OP 

< {smOfW^ni^) - Cf + {2Cz\£nf, 

so that || 0 n(/^) - ‘^11 ^ 2 C 3 X£nlcos9 < 2C3Ae„/yT^02COrO ^ C^XSn, for some constant oc C 3 , 
once C^En is small enough. 

We conclude that ||0(//) - ^{^n{x) + ^n{y))\\ ^ (O + Cb)X£n, by the triangle inequality. 

Approximate affinity. We now know that (j)n is Cen-approximate midlinear on B{u*,rlA) for 
some constant C oc Cq(Z)„ + r) oc Cq (diam((5) + r). This implies, by the result of Vestfrid (2003), 
that is Lemma 17, that there is an affine function T„ such ||0(x) - Tn{x)\ < Clsn for all x e W, 
for some constant Ci oc rC oc tCq( diam((5) + r). 

Approximate similarity. (Reinitialize the constants $C_k, k \ge 1$.) We saw above that $\phi_n$ transforms the regular simplex $z_0 (= x), z_1, \dots, z_d$ with height denoted $h$ satisfying $h \ge t\varepsilon_n/2$ into an $\eta$-approximate one, where $\eta = 2C_0/(c_0 c_d t)$. In what follows, choose these points so that they are all in $B(u^\star, r/2)$ and the simplex has height $h \ge r/8$. (From here on, reinitialize the variables $x, y, \lambda$, etc.) We can then take $t = r/(4\varepsilon_n)$, yielding $\eta = C_1\varepsilon_n$ for a constant $C_1 \propto C_0/(c_0 r)$. By the triangle inequality, we have
\[
\begin{aligned}
\min_{i \ne j}\|T_n(z_i) - T_n(z_j)\|
&\ge \min_{i \ne j}\|\phi_n(z_i) - \phi_n(z_j)\| - 2C_1^\star\varepsilon_n \\
&\ge (1 - C_1\varepsilon_n)\max_{i \ne j}\|\phi_n(z_i) - \phi_n(z_j)\| - 2C_1^\star\varepsilon_n \\
&\ge \max_{i \ne j}\|T_n(z_i) - T_n(z_j)\| - (4C_1^\star + C_1\delta_n)\varepsilon_n.
\end{aligned}
\]


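By the triangle inequality and (17),
\[
\begin{aligned}
\gamma_n := \max_{i,j}\|T_n(z_i) - T_n(z_j)\|
&\ge \max_{i,j}\|\phi_n(z_i) - \phi_n(z_j)\| - 2C_1^\star\varepsilon_n \\
&\ge c_0\max_{i,j}\|z_i - z_j\| - C_0\varepsilon_n - 2C_1^\star\varepsilon_n
\ge c_0 r/8 - (C_0 + 2C_1^\star)\varepsilon_n.
\end{aligned}
\]
Hence, we find that $T_n(z_0), \dots, T_n(z_d)$ form a $C_2\varepsilon_n$-approximate regular simplex, where $C_2 := (4C_1^\star + C_1\delta_n)/(c_0 r/8 - (C_0 + 2C_1^\star)\varepsilon_n)$. Note that its maximum edge length is bounded as follows:
\[
\gamma_n \le \max_{i,j}\|\phi_n(z_i) - \phi_n(z_j)\| + 2C_1^\star\varepsilon_n \le \delta_n + 2C_1^\star\varepsilon_n \le 2\delta_n,
\]
when $2C_1^\star\varepsilon_n \le \delta_n$. By Lemma 19, there is a constant $C_3 > 0$ and an isometry $R_n$, such that we have $\max_{x \in U^\star}\|T_n(x) - \lambda_n R_n(x)\| \le C_3\lambda_n C_2\varepsilon_n$, where $\lambda_n := \gamma_n/h$. Because $r/8 \le h \le r$ and the bounds on $\gamma_n$ above, there is a constant $C_2^\star \ge 1$ such that
\[
1/C_2^\star \le \lambda_n \le C_2^\star. \tag{19}
\]
This implies that
\[
\|\phi_n(x) - \lambda_n R_n(x)\| \le \|\phi_n(x) - T_n(x)\| + \|T_n(x) - \lambda_n R_n(x)\| \le (C_1^\star + C_3 C_2^\star C_2)\varepsilon_n =: C_3^\star\varepsilon_n. \tag{20}
\]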

Covering and conclusion. (Reinitialize the constants $C_k, k \ge 1$.) Let $u_1 = u^\star$ and let $u_2, \dots, u_K \in U$ be such that $u_1, \dots, u_K$ form a maximal $(r/16)$-packing of $U$. (The number 16 is not essential here, but will play a role in the proof of Theorem 4.) Note that $U = U_1 \cup \cdots \cup U_K$ where $U_k := U \cap B(u_k, r/4)$, and note that $U^\star := U_1 \subset U$. For $u, u' \in U_k$, there are $w, w' \in U^\star$ such that $\|w - w'\| = \|u - u'\|$. Define $\tilde\phi_n = \phi_n/\lambda_n$. By (18), and then (19)-(20), we have
\[
\begin{aligned}
\|\tilde\phi_n(u) - \tilde\phi_n(u')\|
&= \|\tilde\phi_n(w) - \tilde\phi_n(w')\| \pm C_0\varepsilon_n/\lambda_n \\
&= \|w - w'\| \pm (C_0 + C_3^\star)C_2^\star\varepsilon_n =: \|w - w'\| \pm C_1\varepsilon_n.
\end{aligned}
\]


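Let
\[
\xi_1 = \min_k \frac{\rho(U_k)}{\operatorname{diam}(U_k)}, \tag{21}
\]
which is strictly positive. The result of Alestalo et al. (2001), namely Lemma 18, gives a constant $C_2$, depending on $C_1$ and $\xi_1$, and an isometry $R_k$ such that $\max_{u \in U_k}\|\tilde\phi_n(u) - R_k(u)\| \le C_2\varepsilon_n$.
Let
\[
\xi_2 = \tfrac12\min\{\rho(U_k \cap U_{k'}) : U_k \cap U_{k'} \ne \emptyset\}. \tag{22}
\]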

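Take $k, k' \in [K]$ such that $U_k \cap U_{k'} \ne \emptyset$, so that there is $u \in U$ such that $B(u, \xi_2) \subset U_k \cap U_{k'}$. Since
\[
\begin{aligned}
\max_{x \in B(u,\xi_2)}\|R_k(x) - R_{k'}(x)\|
&\le \max_{x \in U_k \cap U_{k'}}\big(\|R_k(x) - \tilde\phi_n(x)\| + \|\tilde\phi_n(x) - R_{k'}(x)\|\big) \\
&\le \max_{x \in U_k}\|R_k(x) - \tilde\phi_n(x)\| + \max_{x \in U_{k'}}\|\tilde\phi_n(x) - R_{k'}(x)\| \le 2C_2\varepsilon_n,
\end{aligned}
\]
we have $\|R_k(x) - R_{k'}(x)\| \le (2\|x - u\|/\xi_2 + 1)2C_2\varepsilon_n$ for all $x \in \mathbb{R}^d$, by Lemma 20. Hence, $\|R_k(x) - R_{k'}(x)\| \le (2\operatorname{diam}(U)/\xi_2 + 1)2C_2\varepsilon_n =: C_3\varepsilon_n$ for all $x \in U$. If instead $U_k \cap U_{k'} = \emptyset$, we do as follows. Since $U$ is connected, there is a sequence $k_0 = k, k_1, \dots, k_m = k'$ in $[K]$, such that $U_{k_i} \cap U_{k_{i+1}} \ne \emptyset$. We thus have $\|R_{k_i}(x) - R_{k_{i+1}}(x)\| \le C_3\varepsilon_n$. By the triangle inequality, we conclude that $\max_{x \in U}\|R_k(x) - R_{k'}(x)\| \le KC_3\varepsilon_n$ for any $k, k' \in [K]$. Noting that $R_1 = R_n$ (since $U_1 = U^\star$), for any $k \in [K]$ and $x \in U_k$,
\[
\|\tilde\phi_n(x) - R_n(x)\| \le \|R_k(x) - R_1(x)\| + C_2\varepsilon_n \le (KC_3 + C_2)\varepsilon_n.
\]
We conclude that, for any $x \in U$,
\[
\|\phi_n(x) - \lambda_n R_n(x)\| \le (KC_3 + C_2)\lambda_n\varepsilon_n \le (KC_3 + C_2)C_2^\star\varepsilon_n =: C_4\varepsilon_n. \tag{23}
\]
This concludes the proof when $d \ge 2$.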

A refinement of the constant. Assume now that $U = (U^{\ominus h})^{\oplus h}$ for some $h > 0$. Tracking the constants above, we see that they all depend only on $(d, \rho(U), \operatorname{diam}(U), \operatorname{diam}(Q))$, as well as $\xi_1$ and $\xi_2$ defined in (21) and (22), respectively. We note that $\operatorname{diam}(U_k) \le r$ and $\rho(U_k) \ge \min(r/2, h)$ by Lemma 21, so that $\xi_1 \ge \min(r/2, h)/r$. To bound $\xi_2$, we can do as we did at the beginning of this section, so that at the end of that section, we can restrict our attention to chains $k_0, \dots, k_m$ where $\|u_{k_j} - u_{k_{j+1}}\| \le 2r/16 = r/8$. To be sure, fix $k, k' \in [K]$ and let $\gamma : [0,1] \to U$ be a curve such that $\gamma(0) = u_k$ and $\gamma(1) = u_{k'}$. Define $s_0 = 0$ and then $s_{j+1} = \inf\{s > s_j : \|\gamma(s) - u_{k_j}\| > r/16\}$, and let $k_{j+1} \in [K]$ be such that $\|\gamma(s_{j+1}) - u_{k_{j+1}}\| \le r/16$, which is well-defined since $(u_k, k \in [K])$ is an $(r/16)$-packing of $U$. We then have
\[
\|u_{k_j} - u_{k_{j+1}}\| \le \|u_{k_j} - \gamma(s_{j+1})\| + \|\gamma(s_{j+1}) - u_{k_{j+1}}\| \le r/16 + r/16 = r/8.
\]
We can therefore redefine $\xi_2$ in (22) as $\tfrac12\min\{\rho(U_k \cap U_{k'}) : \|u_k - u_{k'}\| \le r/8\}$. Because $U = (U^{\ominus h})^{\oplus h}$, for each $k \in [K]$, there is $v_k$ such that $u_k \in B(v_k, \min(r/16, h)) \subset U$. By the triangle inequality, $B(v_k, \min(r/16, h)) \subset U_{k'}$ when $\|u_k - u_{k'}\| \le r/8$, so that $\xi_2 \ge \min(r/16, h)$. So we see that everything depends on $(d, h, \rho(U), \operatorname{diam}(U), \operatorname{diam}(Q))$. The second part of the theorem now follows by invariance considerations.
3.4 Proof of Lemma 5

Let $c = \operatorname{ess\,inf}_U f$ and $C = \operatorname{ess\,sup}_U f$, which by assumption belong to $(0, \infty)$. Fix $i \in [n]$ and let $N_i = \#\{j \ne i : \|x_j - x_i\| \le r\}$. For $j \ne i$, $p_i(j) := \mathbb{P}(\|x_j - x_i\| \le r) = \int_{B(x_i,r)} f(u)\,du$. For an upper bound, we have
\[
p_i(j) \le C\operatorname{Vol}(B(x_i,r) \cap U) \le C\operatorname{Vol}(B(x_i,r)) = C\zeta_d r^d =: Q,
\]
where Vol denotes the Lebesgue measure in $\mathbb{R}^d$ and $\zeta_d$ is the volume of the unit ball in $\mathbb{R}^d$. Hence, $\mathbb{P}(N_i > 2(n-1)Q) \le \mathbb{P}(\operatorname{Bin}(n-1, Q) > 2(n-1)Q)$, which is exponentially small in $(n-1)Q$ by Bennett's inequality for the binomial distribution. By the union bound, we conclude that $\max_i N_i \le 2(n-1)Q$ with probability at least one minus $n$ times this bound, which tends to 1 if $nr^d \ge C_0\log n$ and $C_0 > 0$ is sufficiently large.
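As a sanity check of the upper bound, the following minimal simulation draws points uniformly from the unit square (so that the density is bounded by $C = 1$), counts the neighbors of each point within distance $r$, and compares the largest count with $2(n-1)Q$. The sample size, radius, seed and function names are assumptions chosen for illustration only; this is a sketch, not an experiment from the paper.

```python
import math
import random

def unit_ball_volume(d):
    """Lebesgue volume of the unit ball in dimension d."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)

def neighbor_counts(points, r):
    """N_i = number of j != i with ||x_j - x_i|| <= r, for every i."""
    n = len(points)
    counts = [0] * n
    for i in range(n):
        xi = points[i]
        for j in range(i + 1, n):
            if math.dist(xi, points[j]) <= r:
                counts[i] += 1
                counts[j] += 1
    return counts

random.seed(0)
n, d, r = 2000, 2, 0.1
points = [(random.random(), random.random()) for _ in range(n)]

density_upper = 1.0  # essential supremum of the uniform density on the unit square
Q = density_upper * unit_ball_volume(d) * r ** d
counts = neighbor_counts(points, r)
upper = 2 * (n - 1) * Q

print(max(counts), "<=", upper)
```

With these settings each $N_i$ is a binomial count with mean at most $(n-1)Q \approx 63$, so exceeding $2(n-1)Q \approx 126$ for some $i$ would be an extremely unlikely deviation.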

For a lower bound, we use the following lemma. 

Lemma 21. Suppose $U \subset \mathbb{R}^d$ is open and such that $U = (U^{\ominus h})^{\oplus h}$ for some $h > 0$. Then for any $x \in U$ and any $r > 0$, $B(x,r) \cap U$ contains a ball of radius $\min(r,h)/2$. Moreover, the closure of that ball contains $x$.

Proof. By definition, there is $y \in U^{\ominus h}$ such that $x \in B(y,h) \subset U$. We then have $B(x,r) \cap U \supset B(x,r) \cap B(y,h)$, so it suffices to show that the latter contains a ball of radius $\min(r,h)/2$. By symmetry, we may assume that $r \le h$. If $\|x - y\| \le r/2$, then $B(x,r/2) \subset B(y,h)$ and we are done. Otherwise, let $z = (1-t)x + ty$ with $t := r/(2\|x-y\|) \in (0,1)$, and note that $B(z,r/2) \subset B(x,r) \cap B(y,h)$ and $x \in \partial B(z,r/2)$. □
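The construction in the second case of the proof can be checked on a concrete instance. The sketch below computes the center on the segment from $x$ to $y$ at distance $r/2$ from $x$ and leaves the containment checks to a few distance comparisons. The function name `inner_ball_center` and the sample values are assumptions made for illustration.

```python
import math

def inner_ball_center(x, y, r):
    """Center z = (1 - t) x + t y with t = r / (2 ||x - y||), as in the proof."""
    t = r / (2 * math.dist(x, y))
    return [(1 - t) * xi + t * yi for xi, yi in zip(x, y)]

h = 1.0
y = [0.0, 0.0]   # center of a ball B(y, h) assumed to lie inside U
x = [0.9, 0.0]   # a point of B(y, h) with ||x - y|| > r / 2
r = 0.5          # radius with r <= h

z = inner_ball_center(x, y, r)

# x lies on the boundary of B(z, r/2):
dist_zx = math.dist(z, x)
# B(z, r/2) is inside B(x, r) when ||z - x|| + r/2 <= r,
# and inside B(y, h) when ||z - y|| + r/2 <= h.
dist_zy = math.dist(z, y)
```

Here `dist_zx` equals $r/2$ and `dist_zy + r/2` equals $\|x - y\|$, which is at most $h$ because $x \in B(y,h)$.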

Now that Lemma 21 is established, we apply it to get
\[
p_i(j) \ge c\operatorname{Vol}(B(x_i,r) \cap U) \ge c\zeta_d(\min(r,h)/2)^d =: q.
\]
Hence, $\mathbb{P}(N_i < (n-1)q/2) \le \mathbb{P}(\operatorname{Bin}(n-1, q) < (n-1)q/2)$, which is exponentially small in $(n-1)q$. By the union bound, we conclude that $\min_i N_i \ge (n-1)q/2$ with probability at least one minus $n$ times this bound, which tends to 1 if $nr^d \ge C_1\log n$ and $C_1 > 0$ is sufficiently large. (Recall that $h$ is fixed.)

3.5 More auxiliary results 

We list here a few additional auxiliary results that will be used in the proof of Theorem 4.

For $V \subset \mathbb{R}^d$ and $x, x' \in V$, define the intrinsic metric
\[
\delta_V(x,x') = \inf\big\{L : \exists\gamma : [0,L] \to V, \text{ 1-Lipschitz, with } \gamma(0) = x, \gamma(L) = x'\big\},
\]
where $\gamma$ is 1-Lipschitz if $\|\gamma(s) - \gamma(t)\| \le |s - t|$ for all $s, t \in [0,L]$. If no such curve exists, set $\delta_V(x,x') = \infty$. The intrinsic diameter of $V$ is defined as $\sup\{\delta_V(x,x') : x, x' \in V\}$. We note that, if $L := \delta_V(x,x') < \infty$, then there is a curve $\gamma \subset \bar{V}$ with length $L$ joining $x$ and $x'$. Recall that a curve with finite length is said to be rectifiable. See (Burago et al., 2001) for a detailed account of intrinsic metrics.

For $U \subset \mathbb{R}^d$ and $h > 0$, let $U^{\ominus h} = \{x \in U : B(x,h) \subset U\}$. This is referred to as an erosion (of the set $U$) in mathematical morphology.

Lemma 22. If $U$ is open and connected, then for each pair of points $x, x' \in U$, there is $h > 0$ and a rectifiable curve within $U^{\ominus h}$ joining $x$ and $x'$.
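To make the erosion concrete, here is a minimal sketch for the special case where the set is an open axis-parallel box and the balls are open Euclidean balls (both are assumptions of this example, as is the function name `in_erosion_of_box`). In that case a point belongs to the erosion exactly when it is at distance at least $h$ from every face of the box.

```python
def in_erosion_of_box(x, lower, upper, h):
    """True if the open ball B(x, h) is contained in the open box
    (lower[0], upper[0]) x ... x (lower[d-1], upper[d-1])."""
    return all(lo + h <= xk <= up - h for xk, lo, up in zip(x, lower, upper))

lower, upper = (0.0, 0.0), (1.0, 1.0)

center_inside = in_erosion_of_box((0.5, 0.5), lower, upper, 0.25)   # well inside
near_face = in_erosion_of_box((0.1, 0.5), lower, upper, 0.25)       # too close to a face
on_the_edge = in_erosion_of_box((0.25, 0.75), lower, upper, 0.25)   # ball touches two faces
```

The erosion of the unit square by $h = 1/4$ is thus the closed square $[1/4, 3/4]^2$. Points near a corner of the square are not within distance $h$ of that eroded set, so a box is an example of a set that is not the union of the balls of radius $h$ it contains.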
Proof. Take $x, x' \in U$. By taking an intersection with an open ball that contains $x, x'$, if needed, we may assume without loss of generality that $U$ is bounded. Since every connected open set in a Euclidean space is also path-connected (Waldmann, 2014, Example 2.5.13), there is a continuous curve $\gamma : [0,1] \to U$ such that $\gamma(0) = x$ and $\gamma(1) = x'$. A priori, $\gamma$ could have infinite length. However, $\gamma$ ($\equiv \gamma([0,1])$) is compact. For each $t \in [0,1]$, let $r(t) > 0$ be such that $B_t := B(\gamma(t), r(t)) \subset U$. Since $\gamma \subset \bigcup_{t \in [0,1]} B_t$, there is $0 \le t_1 < \cdots < t_m \le 1$ such that $\gamma \subset \bigcup_{j \in [m]} B_{t_j}$. Since $\gamma$ is connected, necessarily, for all $j \in [m-1]$ there is $s_j \in [t_j, t_{j+1}]$ such that $\gamma(s_j) \in B_{t_j} \cap B_{t_{j+1}}$. Let $s_0 = 0$ and $s_m = 1$. Then $[\gamma(s_j)\gamma(s_{j+1})] \subset B_{t_{j+1}} \subset U$ for all $j \in \{0, \dots, m-1\}$, and therefore the polygonal line defined by $x = \gamma(s_0), \gamma(s_1), \dots, \gamma(s_{m-1}), \gamma(s_m) = x'$ is inside $\bigcup_{j \in [m]} B_{t_j}$, where $r := \min_{j \in [m]} r(t_j) > 0$. By construction, this polygonal line joins $x$ and $x'$, and is also rectifiable since it has a finite number of vertices. □

Lemma 23. Suppose $U \subset \mathbb{R}^d$ is bounded, connected, and such that $U = (U^{\ominus h})^{\oplus h}$ for some $h > 0$. Then there is $h_\ddagger > 0$ such that, for all $h' \in [0, h_\ddagger]$, the intrinsic diameter of $U^{\ominus h'}$ is finite.

Proof. Let $V = U^{\ominus h}$. By assumption, for all $x \in U$, there is $y \in V$ such that $x \in B(y,h) \subset U$. In particular, $U \supset V \ne \emptyset$.

Let $V_1$ be a connected component of $V$. Pick $y_1 \in V_1$ and note that $B_1 := B(y_1, h) \subset U$ by definition, and also $B_1 \subset V_1$ because $B_1$ is connected. Let $\zeta_d$ be the volume of the unit ball in $\mathbb{R}^d$. Since the connected components are disjoint and each has volume at least $\zeta_d h^d$ while $U$ has volume at most $\zeta_d(\operatorname{diam}(U)/2)^d$, $V$ can have at most $\lfloor(\operatorname{diam}(U)/2h)^d\rfloor$ connected components, which we now denote by $V_1, \dots, V_K$. Pick $y_k \in V_k$ for each $k \in [K]$. Applying Lemma 22, for each pair of distinct $k, k' \in [K]$, there is a rectifiable (i.e., finite-length) path $\gamma_{k,k'} \subset U$ joining $y_k$ and $y_{k'}$. By Lemma 22, the length of $\gamma_{k,k'}$, denoted $D_{k,k'}$, is finite, and there is $h_{k,k'} > 0$ such that $\gamma_{k,k'} \subset U^{\ominus h_{k,k'}}$. Let $D_\ddagger = \max_{k,k' \in [K]} D_{k,k'}$ and $h_\ddagger = \min_{k,k' \in [K]} h_{k,k'}$.

We now show that each connected component $V_k$ has finite diameter in the intrinsic metric of $V' := U^{\ominus h/2}$. Since $V_k$ is bounded, there is $x_1, \dots, x_{m_k} \in V_k$ such that $V_k \subset \bigcup_{j \in [m_k]} Q_j$ where $Q_j := B(x_j, h/2) \subset V'$. Take any $x, x' \in V_k$. Let $j, j' \in [m_k]$ be such that $x \in Q_j$ and $x' \in Q_{j'}$. Since $V_k$ is connected, there is a sequence $j = j_0, \dots, j_{S_k} = j' \in [m_k]$ such that $Q_{j_s} \cap Q_{j_{s+1}} \ne \emptyset$ for all $s = 0, \dots, S_k - 1$. Choose $z_s \in Q_{j_s} \cap Q_{j_{s+1}}$ and let $z_0 = x$ and $z_{S_k} = x'$. Then $[z_s z_{s+1}] \subset Q_{j_{s+1}}$ for all $s$. Let $L$ be the polygonal line formed by $z_0, \dots, z_{S_k}$. By construction, $L \subset \bigcup_{s=0}^{S_k} Q_{j_s} \subset V'$, it joins $x$ and $x'$, and has length at most $(S_k + 1)2h$. Hence, $\delta_{V'}(x,x') \le (S_k + 1)2h \le 2(m_k + 1)h$. This being valid for all $x, x' \in V_k$, we proved that $V_k$ has diameter at most $D_k := 2(m_k + 1)h$ in the intrinsic metric of $V'$. Let $D_\star = \max_{k \in [K]} D_k$.

Now take $h' \in [0, h_\ddagger]$ and any $x, x' \in U^{\ominus h'}$. Let $y, y' \in V$ be such that $x \in B(y,h)$ and $x' \in B(y',h)$. Let $k, k' \in [K]$ be such that $y \in V_k$ and $y' \in V_{k'}$. There are curves $\gamma, \gamma' \subset V'$ of length at most $D_\star$ such that $\gamma$ joins $y$ and $y_k$, while $\gamma'$ joins $y'$ and $y_{k'}$. We then join $y_k$ and $y_{k'}$ with $\gamma_{k,k'}$. All together, we have the curve $[xy] \cup \gamma \cup \gamma_{k,k'} \cup \gamma' \cup [y'x']$, which joins $x$ and $x'$, lies entirely in $U^{\ominus h'}$, and has length bounded by $h + D_\star + D_\ddagger + D_\star + h =: D$. And this is true for any pair of such points. □

Lemma 24. Suppose that $S_1, S_2 : \mathbb{R}^d \to \mathbb{R}^d$ are two affinities such that $\max_j\|S_1(z_j) - S_2(z_j)\| \le \varepsilon$, where $z_0, \dots, z_d$ form an $\eta$-approximate regular simplex with minimum edge length at least $\lambda$. There is $C > 0$ depending only on $d$ such that, if $\eta \le 1/C$, then $\|S_1(x) - S_2(x)\| \le C\varepsilon\|x - z_0\|/\lambda + \varepsilon$ for all $x \in \mathbb{R}^d$.

Proof. Note that this is closely related to Lemma 20. By translation and scale invariance, assume that $z_0 = 0$ and $\lambda = 1$. Let $L_i = S_i - S_i(0)$. We have $\|L_1(z_j) - L_2(z_j)\| \le \|S_1(z_j) - S_2(z_j)\| + \|S_1(0) - S_2(0)\| \le 2\varepsilon$. Let $Z$ denote the matrix with columns $z_1, \dots, z_d$. In matrix notation, we have
\[
\|(L_1 - L_2)Z\|_F = \sqrt{\sum_j \|(L_1 - L_2)z_j\|^2} \le 2\sqrt{d}\,\varepsilon.
\]
We also have $\|(L_1 - L_2)Z\|_F \ge \|(L_1 - L_2)Z\| \ge \sigma_d(Z)\|L_1 - L_2\|$, and by Lemma 12, $\sigma_d(Z) = \sigma_d([z_0, Z]) \ge 1/C_1$ when $\eta \le 1/C_1$, where $C_1$ depends only on $d$. In that case, $\|L_1 - L_2\| \le C_2\varepsilon$ for another constant $C_2$. Equivalently, for $x \in \mathbb{R}^d$, $\|L_1(x) - L_2(x)\| \le C_2\varepsilon\|x\|$, which in turn implies that $\|S_1(x) - S_2(x)\| \le \|L_1(x) - L_2(x)\| + \|S_1(0) - S_2(0)\| \le C_2\varepsilon\|x\| + \varepsilon$. □

3.6 Proof of Theorem 4 

Because $\phi_n$ is bounded independently of $n$, we may assume without loss of generality that $C_0\varepsilon_n \le r_n$ and $C_0 r_n \le h$ for all $n$, where $C_0 \ge 1$ will be chosen large enough later on.

Take $y \in U$ and let $\Omega_y = \Omega_n \cap B(y, r_n)$ and $Q_y = \phi_n(\Omega_y)$. We first show that there is $C_1 \propto \operatorname{diam}(Q)/\rho(U)$ such that, for any $y \in U$, $\operatorname{diam}(Q_y) \le C_1 r_n$. For this, we mimic the proof of Lemma 3. Take $x, x' \in \Omega_y$ such that $\xi := \|\phi_n(x) - \phi_n(x')\| = \operatorname{diam}(Q_y)$. Let $u$ be such that $B(u, \rho(U)) \subset U$. Let $y_1, \dots, y_m$ be an $(r_n + 2\varepsilon_n)$-packing of $B(u, \rho(U))$ with $m \ge A_1(\rho(U)/r_n)^d$ for some $A_1 \propto 1$. Then let $\{x_{i_s} : s \in [m]\} \subset \Omega_n$ be such that $\max_{s \in [m]}\|y_s - x_{i_s}\| \le \varepsilon_n$. By the triangle inequality, for all $s \ne t$, we have $\|x_{i_s} - x_{i_t}\| \ge \|y_s - y_t\| - 2\varepsilon_n > r_n \ge \|x - x'\|$. By (10), we have $\|\phi_n(x_{i_s}) - \phi_n(x_{i_t})\| \ge \xi$, so that $\phi_n(x_{i_1}), \dots, \phi_n(x_{i_m})$ form a $\xi$-packing. Therefore $m \le A_2(\operatorname{diam}(Q)/\xi)^d$ for some $A_2 \propto 1$. We conclude that $\xi \le (A_2/A_1)^{1/d}(\operatorname{diam}(Q)/\rho(U))r_n =: C_1 r_n$.
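For the reader's convenience, the last rearrangement can be written out explicitly. It uses only the two bounds on the packing size $m$ stated in the paragraph above, with $\xi$ the diameter being bounded, $A_1$ and $A_2$ the two packing constants, $\rho(U)$ the radius of the ball inside $U$, and $Q$ the target set:

```latex
\[
A_1\Big(\frac{\rho(U)}{r_n}\Big)^{d} \;\le\; m \;\le\; A_2\Big(\frac{\operatorname{diam}(Q)}{\xi}\Big)^{d}
\quad\Longrightarrow\quad
\xi^{d} \;\le\; \frac{A_2}{A_1}\Big(\frac{\operatorname{diam}(Q)}{\rho(U)}\Big)^{d} r_n^{d}
\quad\Longrightarrow\quad
\xi \;\le\; \Big(\frac{A_2}{A_1}\Big)^{1/d}\frac{\operatorname{diam}(Q)}{\rho(U)}\, r_n .
\]
```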

We apply Theorem 3 to $U_y := B(y, r_n)$ and $Q_y$. With the fact that $\delta_H(\Omega_y, U_y) \le 2\varepsilon_n$ (as we saw in the proof of (17)) and invariance considerations, we obtain a constant $C \propto 1$ and a similarity $S_y$ such that $\max_{x \in \Omega_y}\|\phi_n(x) - S_y(x)\| \le C(\operatorname{diam}(Q_y)/r_n)\varepsilon_n \le CC_1\varepsilon_n =: C_2\varepsilon_n$. (Note that all the quantities with subscript $y$ depend also on $n$, but this will be left implicit.)

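Fix $y_\star \in U^{\ominus r_n}$. For $x \in \Omega_n$, there is $y \in U^{\ominus r_n}$ such that $x \in U_y$. Let $h_\ddagger$ be given by Lemma 23 and let $D$ denote the intrinsic diameter of $U^{\ominus r_n}$. Then assuming $r_n \le h_\ddagger$, there is a curve $\gamma \subset U^{\ominus r_n}$ of length $L \le D$ joining $y_\star$ and $y$. Assume $\gamma$ is parameterized by arc length. Let $y_0 = y_\star$, $y_j = \gamma(j r_n)$ for $j = 0, \dots, J := \lfloor L/r_n \rfloor$, and then $y_{J+1} = y$. We have $\max_{z \in \Omega_{y_j} \cap \Omega_{y_{j+1}}}\|S_{y_j}(z) - S_{y_{j+1}}(z)\| \le 2C_2\varepsilon_n$ by the triangle inequality. We also have $\rho(U_{y_j} \cap U_{y_{j+1}}) \ge r_n/2$, because $\|y_j - y_{j+1}\| \le r_n$. Let $v_j$ be such that $B(v_j, r_n/2) \subset U_{y_j} \cap U_{y_{j+1}}$. Fix $j$ and let $v_{j,0}, \dots, v_{j,d}$ denote a regular simplex inscribed in the ball $B(v_j, r_n/4)$. Let $\lambda_n \propto r_n$ denote its edge length. Then let $x_{j,0}, \dots, x_{j,d} \in \Omega_n$ be such that $\max_k\|x_{j,k} - v_{j,k}\| \le \varepsilon_n$. When $C_0$ is large enough, $x_{j,0}, \dots, x_{j,d} \in B(v_j, r_n/2)$ by the triangle inequality. Moreover, $\max_{k \ne l}\|x_{j,k} - x_{j,l}\| \le \lambda_n + 2\varepsilon_n$, as well as $\min_{k \ne l}\|x_{j,k} - x_{j,l}\| \ge \lambda_n - 2\varepsilon_n$. When $C_0$ is large enough, $F_j := \{x_{j,0}, \dots, x_{j,d}\}$ is therefore an $\eta$-approximate regular simplex, with $\eta \propto \varepsilon_n/r_n$, and minimum edge length $\propto r_n$. Now, since $\max_k\|S_{y_j}(x_{j,k}) - S_{y_{j+1}}(x_{j,k})\| \le 2C_2\varepsilon_n$, by Lemma 24, for all $z \in \mathbb{R}^d$, $\|S_{y_j}(z) - S_{y_{j+1}}(z)\| \le CC_2\varepsilon_n\|z - x_{j,0}\|/r_n + 2C_2\varepsilon_n$ for some $C \propto 1$, assuming $\varepsilon_n/r_n \le 1/C$. In particular, by the fact that $\|x - x_{j,0}\| \le \operatorname{diam}(U)$, this gives $\|S_{y_j}(x) - S_{y_{j+1}}(x)\| \le C_3\varepsilon_n/r_n$ for some $C_3 \propto \operatorname{diam}(U)C_2$. Hence,
\[
\|S_{y_\star}(x) - S_y(x)\| \le (J+1)C_3\varepsilon_n/r_n \le C_4\varepsilon_n/r_n^2,
\]
since $J \le L/r_n \le D/r_n$.

This being true for any arbitrary $x \in \Omega_n$, we conclude that
\[
\max_{x \in \Omega_n}\|\phi_n(x) - S_{y_\star}(x)\| \le C_4\varepsilon_n/r_n^2 + C_2\varepsilon_n \le C_5\varepsilon_n/r_n^2.
\]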
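The passage from the chain of $J+1$ local comparisons to the rate $\varepsilon_n/r_n^2$ can be unpacked as follows. This is a sketch added for clarity: it assumes in addition that $r_n \le D$, and the explicit value $2DC_3$ is only one admissible choice for the constant in front of $\varepsilon_n/r_n^2$, not necessarily the one the author has in mind.

```latex
\[
(J+1)\,\frac{C_3\varepsilon_n}{r_n}
\;\le\; \Big(\frac{D}{r_n} + 1\Big)\frac{C_3\varepsilon_n}{r_n}
\;\le\; \frac{2D}{r_n}\cdot\frac{C_3\varepsilon_n}{r_n}
\;=\; 2DC_3\,\frac{\varepsilon_n}{r_n^{2}},
\qquad\text{using } J \le \frac{D}{r_n} \text{ and } r_n \le D .
\]
```

Each of the $J+1 \asymp 1/r_n$ steps along the curve contributes an error of order $\varepsilon_n/r_n$, which is where the extra factor $1/r_n$ in the final rate comes from.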
4 Discussion

This paper builds on (Kleindessner and von Luxburg, 2014) to provide some theory for ordinal 
embedding, an important problem in multivariate statistics (aka unsupervised learning). We leave 
open two main problems: 

• What are the optimal rates of convergence for ordinal embedding with all triple and quadruple
comparisons?

• What is the minimum size of $K = K_n$ for consistency of ordinal embedding based on the
$K$-nearest neighbor distance comparisons?

We note that we only studied the large sample behavior of exact embedding methods. In particular,
we did not discuss or propose any methodology for producing such an embedding. For this,
we refer the reader to (Agarwal et al., 2007; Borg and Groenen, 2005; Terada and Von Luxburg,
2014) and references therein. In fact, the practice of ordinal embedding raises a number of other
questions in terms of theory, for instance:

• How many flawed comparisons can be tolerated? 